2025-12-04T12:52:51.7474639Z Current runner version: '2.329.0'
2025-12-04T12:52:51.7477527Z Runner name: 'linux.rocm.gpu.gfx942.4.b-bphpw-runner-rpncb'
2025-12-04T12:52:51.7477919Z Runner group name: 'default'
2025-12-04T12:52:51.7478320Z Machine name: 'linux'
2025-12-04T12:52:51.7479432Z ##[group]GITHUB_TOKEN Permissions
2025-12-04T12:52:51.7480548Z Contents: read
2025-12-04T12:52:51.7480791Z Metadata: read
2025-12-04T12:52:51.7481032Z ##[endgroup]
2025-12-04T12:52:51.7482061Z Secret source: Actions
2025-12-04T12:52:51.7482346Z Prepare workflow directory
2025-12-04T12:52:51.7718075Z Prepare all required actions
2025-12-04T12:52:51.7737663Z Getting action download info
2025-12-04T12:52:52.1841983Z Download action repository 'pytorch/pytorch@main' (SHA:a2b5dfb956aed182f6aefce1ff2eda70c35049e1)
2025-12-04T12:52:56.0941395Z Download action repository 'pytorch/test-infra@main' (SHA:39aa74d619174326f4e2fb0e216151c2f29d9ffd)
2025-12-04T12:52:57.1823833Z Download action repository 'actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02)
2025-12-04T12:52:58.0777025Z Download action repository 'aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722' (SHA:ececac1a45f3b08a01d2dd070d28d111c5fe6722)
2025-12-04T12:52:58.9545176Z Getting action download info
2025-12-04T12:52:59.1483589Z Download action repository 'actions/checkout@v4' (SHA:34e114876b0b11c390a56381ad16ebd13914f8d5)
2025-12-04T12:52:59.9467994Z Getting action download info
2025-12-04T12:53:00.1583330Z Download action repository 'nick-fields/retry@v3.0.0' (SHA:7152eba30c6575329ac0576536151aca5a72780e)
2025-12-04T12:53:00.9520617Z Getting action download info
2025-12-04T12:53:01.1362758Z Uses: pytorch/pytorch/.github/workflows/_rocm-test.yml@refs/heads/main (ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32)
2025-12-04T12:53:01.1364894Z ##[group] Inputs
2025-12-04T12:53:01.1365063Z build-environment: linux-jammy-rocm-py3.10
2025-12-04T12:53:01.1368430Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]}
2025-12-04T12:53:01.1371861Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a
2025-12-04T12:53:01.1372170Z sync-tag:
2025-12-04T12:53:01.1372733Z timeout-minutes: 300
2025-12-04T12:53:01.1372855Z tests-to-include:
2025-12-04T12:53:01.1372964Z dashboard-tag:
2025-12-04T12:53:01.1373202Z disable-monitor: true
2025-12-04T12:53:01.1373335Z monitor-log-interval: 5
2025-12-04T12:53:01.1373465Z monitor-data-collect-interval: 1
2025-12-04T12:53:01.1373599Z ##[endgroup]
2025-12-04T12:53:01.1373819Z Complete job name: linux-jammy-rocm-py3.10 / test (distributed, 3, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable)
2025-12-04T12:53:01.1689992Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main
2025-12-04T12:53:01.1690354Z with:
2025-12-04T12:53:01.1690452Z   no-sudo: true
2025-12-04T12:53:01.1690556Z   submodules: recursive
2025-12-04T12:53:01.1690660Z   fetch-depth: 0
2025-12-04T12:53:01.1690859Z env:
2025-12-04T12:53:01.1690962Z   GIT_DEFAULT_BRANCH: main
2025-12-04T12:53:01.1691090Z ##[endgroup]
2025-12-04T12:53:01.1736568Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-12-04T12:53:01.1736958Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-12-04T12:53:01.1744526Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2025-12-04T12:53:01.1744683Z env:
2025-12-04T12:53:01.1744778Z   GIT_DEFAULT_BRANCH: main
2025-12-04T12:53:01.1744881Z ##[endgroup]
2025-12-04T12:53:01.1918513Z ##[group]Run actions/checkout@v4
2025-12-04T12:53:01.1918706Z with:
2025-12-04T12:53:01.1918826Z   ref: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32
2025-12-04T12:53:01.1918966Z   fetch-depth: 0
2025-12-04T12:53:01.1919063Z   submodules: recursive
2025-12-04T12:53:01.1919255Z   show-progress: false
2025-12-04T12:53:01.1919364Z   repository: pytorch/pytorch
2025-12-04T12:53:01.1919535Z   token: ***
2025-12-04T12:53:01.1919624Z   ssh-strict: true
2025-12-04T12:53:01.1919717Z   ssh-user: git
2025-12-04T12:53:01.1919807Z   persist-credentials: true
2025-12-04T12:53:01.1919915Z   clean: true
2025-12-04T12:53:01.1920026Z   sparse-checkout-cone-mode: true
2025-12-04T12:53:01.1920146Z   fetch-tags: false
2025-12-04T12:53:01.1920302Z   lfs: false
2025-12-04T12:53:01.1920393Z   set-safe-directory: true
2025-12-04T12:53:01.1920497Z env:
2025-12-04T12:53:01.1920582Z   GIT_DEFAULT_BRANCH: main
2025-12-04T12:53:01.1920689Z ##[endgroup]
2025-12-04T12:53:01.2460442Z Syncing repository: pytorch/pytorch
2025-12-04T12:53:01.2461026Z ##[group]Getting Git version info
2025-12-04T12:53:01.2461192Z Working directory is '/home/runner/_work/pytorch/pytorch'
2025-12-04T12:53:01.2461444Z [command]/usr/bin/git version
2025-12-04T12:53:01.2461562Z git version 2.52.0
2025-12-04T12:53:01.2463650Z ##[endgroup]
2025-12-04T12:53:01.2468156Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/73daeebe-5ee8-442c-854f-7e8cd805ec30/.gitconfig'
2025-12-04T12:53:01.2473233Z Temporarily overriding HOME='/home/runner/_work/_temp/73daeebe-5ee8-442c-854f-7e8cd805ec30' before making global git config changes
2025-12-04T12:53:01.2473848Z Adding repository directory to the temporary git global config as a safe directory
2025-12-04T12:53:01.2476074Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch
2025-12-04T12:53:01.2505207Z [command]/usr/bin/git config --local --get remote.origin.url
2025-12-04T12:53:01.2524848Z https://github.com/pytorch/pytorch
2025-12-04T12:53:01.2533219Z ##[group]Removing previously created refs, to avoid conflicts
2025-12-04T12:53:01.2535437Z [command]/usr/bin/git rev-parse --symbolic-full-name --verify --quiet HEAD
2025-12-04T12:53:01.2556734Z refs/heads/main
2025-12-04T12:53:01.2568578Z [command]/usr/bin/git checkout --detach
2025-12-04T12:53:02.9327440Z HEAD is now at 685ba6bc0117 add back legalize_graph for BC reason (#169541)
2025-12-04T12:53:02.9401096Z [command]/usr/bin/git branch --delete --force main
2025-12-04T12:53:02.9562257Z Deleted branch main (was 685ba6bc0117).
2025-12-04T12:53:02.9565752Z ##[endgroup]
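In the "Removing previously created refs" group just closed above, the checkout action detaches HEAD before force-deleting the local main branch, so the upcoming fetch can rewrite remote-tracking refs without colliding with a stale local ref pointing at the same history. A minimal sketch of that pattern, using only git invocations that appear verbatim in this log (refspecs unchanged):

  # Detach HEAD so no local branch is checked out, then drop the stale ref.
  git checkout --detach
  git branch --delete --force main
  # The fetch later in this job can now recreate refs/remotes/origin/* and
  # refs/tags/* without conflicts.
  git -c protocol.version=2 fetch --prune --no-recurse-submodules origin \
    "+refs/heads/*:refs/remotes/origin/*" "+refs/tags/*:refs/tags/*"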
2025-12-04T12:53:02.9569469Z [command]/usr/bin/git submodule status
2025-12-04T12:53:02.9806459Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe)
2025-12-04T12:53:02.9872833Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 third_party/FP16 (4dfe081)
2025-12-04T12:53:02.9928039Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327)
2025-12-04T12:53:02.9987151Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0)
2025-12-04T12:53:03.0024691Z 3ebbc93ded7285963bff932c678fa367eb393ba6 third_party/NVTX (v3.1.0-313-g3ebbc93)
2025-12-04T12:53:03.0079348Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600)
2025-12-04T12:53:03.0387079Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656)
2025-12-04T12:53:03.0415154Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101)
2025-12-04T12:53:03.0437633Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3)
2025-12-04T12:53:03.0494508Z 7fe50dc3da2069d6645d9deb8c017a876472a977 third_party/composable_kernel (rocm-6.4.3-459-g7fe50dc3d)
2025-12-04T12:53:03.0590912Z 89c932f313c6437c38f2982869beacc89c2f2246 third_party/cpp-httplib (v0.26.0)
2025-12-04T12:53:03.0688127Z f858c30bcb16f8effd5ff46996f0514539e17abc third_party/cpuinfo (f858c30)
2025-12-04T12:53:03.0723629Z 0b1577c8c83401237d601d0d0db5210506705396 third_party/cudnn_frontend (v0.5-61-g0b1577c)
2025-12-04T12:53:03.0791544Z f88806b1e31dfa579842638740216dd41fc6c588 third_party/cutlass (v4.3.1)
2025-12-04T12:53:03.0811080Z c0b988d39a9e47c794d699f29930ed4d7c7e13a4 third_party/fbgemm (v1.4.0-rc1-2-gc0b988d39)
2025-12-04T12:53:03.0870521Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4)
2025-12-04T12:53:03.0903593Z a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23)
2025-12-04T12:53:03.1135906Z 407c905e45ad75fc29bf0f9bb7c5c2fd3475976f third_party/fmt (12.1.0)
2025-12-04T12:53:03.1221799Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17)
2025-12-04T12:53:03.1307484Z 54cbae0d3a67fa890b4c3d9ee162b7860315e341 third_party/gloo (remotes/origin/gh/c-p-i-o/1/base-37-g54cbae0)
2025-12-04T12:53:03.1440089Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108)
2025-12-04T12:53:03.1495744Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1)
2025-12-04T12:53:03.1546213Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5)
2025-12-04T12:53:03.1689270Z 31f85df8fbd89c188f14ef10f1ec65379786b943 third_party/kineto (heads/main)
2025-12-04T12:53:03.1713773Z d7770c89632329a9914ef1a90289917597639cbe third_party/kleidiai (v1.15.0)
2025-12-04T12:53:03.1728066Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4)
2025-12-04T12:53:03.1746958Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0)
2025-12-04T12:53:03.1952208Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0)
2025-12-04T12:53:03.1967290Z a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2)
2025-12-04T12:53:03.1986658Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5)
2025-12-04T12:53:03.2202700Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b)
2025-12-04T12:53:03.2245439Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master)
2025-12-04T12:53:03.2287341Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e)
2025-12-04T12:53:03.2310252Z f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8 third_party/pybind11 (v3.0.1)
2025-12-04T12:53:03.2371322Z f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy (remotes/origin/pre-generated)
2025-12-04T12:53:03.2417343Z 5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8)
2025-12-04T12:53:03.2469152Z 2b4cd91092d335a697416b2a3cb398283246849d third_party/tensorpipe (heads/main)
2025-12-04T12:53:03.2481779Z ##[group]Cleaning the repository
2025-12-04T12:53:03.2486715Z [command]/usr/bin/git clean -ffdx
2025-12-04T12:53:03.2611244Z [command]/usr/bin/git reset --hard HEAD
2025-12-04T12:53:03.3352997Z HEAD is now at 685ba6bc0117 add back legalize_graph for BC reason (#169541)
2025-12-04T12:53:03.3422911Z ##[endgroup]
2025-12-04T12:53:03.3424934Z ##[group]Disabling automatic garbage collection
2025-12-04T12:53:03.3428587Z [command]/usr/bin/git config --local gc.auto 0
2025-12-04T12:53:03.3460538Z ##[endgroup]
2025-12-04T12:53:03.3460802Z ##[group]Setting up auth
2025-12-04T12:53:03.3463912Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand
2025-12-04T12:53:03.3487267Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :"
2025-12-04T12:53:03.3701997Z Entering 'android/libs/fbjni'
2025-12-04T12:53:03.3736907Z Entering 'third_party/FP16'
2025-12-04T12:53:03.3763330Z Entering 'third_party/FXdiv'
2025-12-04T12:53:03.3792919Z Entering 'third_party/NNPACK'
2025-12-04T12:53:03.3822166Z Entering 'third_party/NVTX'
2025-12-04T12:53:03.3847961Z Entering 'third_party/VulkanMemoryAllocator'
2025-12-04T12:53:03.3875809Z Entering 'third_party/XNNPACK'
2025-12-04T12:53:03.3904045Z Entering 'third_party/aiter'
2025-12-04T12:53:03.3931797Z Entering 'third_party/aiter/3rdparty/composable_kernel'
2025-12-04T12:53:03.3979725Z Entering 'third_party/benchmark'
2025-12-04T12:53:03.4002447Z Entering 'third_party/composable_kernel'
2025-12-04T12:53:03.4031377Z Entering 'third_party/cpp-httplib'
2025-12-04T12:53:03.4072569Z Entering 'third_party/cpuinfo'
2025-12-04T12:53:03.4105670Z Entering 'third_party/cudnn_frontend'
2025-12-04T12:53:03.4143237Z Entering 'third_party/cutlass'
2025-12-04T12:53:03.4176022Z Entering 'third_party/fbgemm'
2025-12-04T12:53:03.4211917Z Entering 'third_party/fbgemm/external/asmjit'
2025-12-04T12:53:03.4248976Z Entering 'third_party/fbgemm/external/composable_kernel'
2025-12-04T12:53:03.4282374Z Entering 'third_party/fbgemm/external/cpuinfo'
2025-12-04T12:53:03.4319023Z Entering 'third_party/fbgemm/external/cutlass'
2025-12-04T12:53:03.4355837Z Entering 'third_party/fbgemm/external/googletest'
2025-12-04T12:53:03.4384878Z Entering 'third_party/fbgemm/external/hipify_torch'
2025-12-04T12:53:03.4417348Z Entering 'third_party/fbgemm/external/json'
2025-12-04T12:53:03.4447998Z Entering 'third_party/flash-attention'
2025-12-04T12:53:03.4475416Z Entering 'third_party/flash-attention/csrc/composable_kernel'
2025-12-04T12:53:03.4502659Z Entering 'third_party/flash-attention/csrc/cutlass'
2025-12-04T12:53:03.4542210Z Entering 'third_party/flatbuffers'
2025-12-04T12:53:03.4579366Z Entering 'third_party/fmt'
2025-12-04T12:53:03.4606047Z Entering 'third_party/gemmlowp/gemmlowp'
2025-12-04T12:53:03.4633565Z Entering 'third_party/gloo'
2025-12-04T12:53:03.4658840Z Entering 'third_party/googletest'
2025-12-04T12:53:03.4699949Z Entering 'third_party/ideep'
2025-12-04T12:53:03.4725640Z Entering 'third_party/ideep/mkl-dnn'
2025-12-04T12:53:03.4764638Z Entering 'third_party/ittapi'
2025-12-04T12:53:03.4793599Z Entering 'third_party/kineto'
2025-12-04T12:53:03.4822705Z Entering 'third_party/kineto/libkineto/third_party/dynolog'
2025-12-04T12:53:03.4846489Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'
2025-12-04T12:53:03.4882738Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'
2025-12-04T12:53:03.4912685Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'
2025-12-04T12:53:03.4938486Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'
2025-12-04T12:53:03.4968021Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'
2025-12-04T12:53:03.4998833Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog'
2025-12-04T12:53:03.5023101Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'
2025-12-04T12:53:03.5050913Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json'
2025-12-04T12:53:03.5087958Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'
2025-12-04T12:53:03.5115598Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp'
2025-12-04T12:53:03.5142108Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb'
2025-12-04T12:53:03.5178649Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest'
2025-12-04T12:53:03.5208409Z Entering 'third_party/kineto/libkineto/third_party/fmt'
2025-12-04T12:53:03.5230340Z Entering 'third_party/kineto/libkineto/third_party/googletest'
2025-12-04T12:53:03.5268947Z Entering 'third_party/kleidiai'
2025-12-04T12:53:03.5302880Z Entering 'third_party/mimalloc'
2025-12-04T12:53:03.5329086Z Entering 'third_party/nlohmann'
2025-12-04T12:53:03.5358804Z Entering 'third_party/onnx'
2025-12-04T12:53:03.5390323Z Entering 'third_party/onnx/third_party/pybind11'
2025-12-04T12:53:03.5417791Z Entering 'third_party/opentelemetry-cpp'
2025-12-04T12:53:03.5443229Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark'
2025-12-04T12:53:03.5471145Z Entering 'third_party/opentelemetry-cpp/third_party/googletest'
2025-12-04T12:53:03.5507906Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl'
2025-12-04T12:53:03.5534528Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json'
2025-12-04T12:53:03.5562728Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto'
2025-12-04T12:53:03.5590691Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp'
2025-12-04T12:53:03.5615844Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp'
2025-12-04T12:53:03.5646151Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'
2025-12-04T12:53:03.5688341Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'
2025-12-04T12:53:03.5723631Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg'
2025-12-04T12:53:03.5770500Z Entering 'third_party/pocketfft'
2025-12-04T12:53:03.5799877Z Entering 'third_party/protobuf'
2025-12-04T12:53:03.5830788Z Entering 'third_party/protobuf/third_party/benchmark'
2025-12-04T12:53:03.5858581Z Entering 'third_party/protobuf/third_party/googletest'
2025-12-04T12:53:03.5887696Z Entering 'third_party/psimd'
2025-12-04T12:53:03.5921482Z Entering 'third_party/pthreadpool'
2025-12-04T12:53:03.5946895Z Entering 'third_party/pybind11'
2025-12-04T12:53:03.5970742Z Entering 'third_party/python-peachpy'
2025-12-04T12:53:03.5992791Z Entering 'third_party/sleef'
2025-12-04T12:53:03.6018095Z Entering 'third_party/tensorpipe'
2025-12-04T12:53:03.6048141Z Entering 'third_party/tensorpipe/third_party/googletest'
2025-12-04T12:53:03.6078838Z Entering 'third_party/tensorpipe/third_party/libnop'
2025-12-04T12:53:03.6107110Z Entering 'third_party/tensorpipe/third_party/libuv'
2025-12-04T12:53:03.6128525Z Entering 'third_party/tensorpipe/third_party/pybind11'
2025-12-04T12:53:03.6153817Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang'
2025-12-04T12:53:03.6204937Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader
2025-12-04T12:53:03.6225887Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :"
2025-12-04T12:53:03.6397760Z Entering 'android/libs/fbjni'
2025-12-04T12:53:03.6419360Z Entering 'third_party/FP16'
2025-12-04T12:53:03.6444026Z Entering 'third_party/FXdiv'
2025-12-04T12:53:03.6469568Z Entering 'third_party/NNPACK'
2025-12-04T12:53:03.6495227Z Entering 'third_party/NVTX'
2025-12-04T12:53:03.6517158Z Entering 'third_party/VulkanMemoryAllocator'
2025-12-04T12:53:03.6537984Z Entering 'third_party/XNNPACK'
2025-12-04T12:53:03.6564556Z Entering 'third_party/aiter'
2025-12-04T12:53:03.6589248Z Entering 'third_party/aiter/3rdparty/composable_kernel'
2025-12-04T12:53:03.6618437Z Entering 'third_party/benchmark'
2025-12-04T12:53:03.6643876Z Entering 'third_party/composable_kernel'
2025-12-04T12:53:03.6676222Z Entering 'third_party/cpp-httplib'
2025-12-04T12:53:03.6701117Z Entering 'third_party/cpuinfo'
2025-12-04T12:53:03.6723740Z Entering 'third_party/cudnn_frontend'
2025-12-04T12:53:03.6744966Z Entering 'third_party/cutlass'
2025-12-04T12:53:03.6773196Z Entering 'third_party/fbgemm'
2025-12-04T12:53:03.6801951Z Entering 'third_party/fbgemm/external/asmjit'
2025-12-04T12:53:03.6834903Z Entering 'third_party/fbgemm/external/composable_kernel'
2025-12-04T12:53:03.6869725Z Entering 'third_party/fbgemm/external/cpuinfo'
2025-12-04T12:53:03.6892064Z Entering 'third_party/fbgemm/external/cutlass'
2025-12-04T12:53:03.6916825Z Entering 'third_party/fbgemm/external/googletest'
2025-12-04T12:53:03.6943504Z Entering 'third_party/fbgemm/external/hipify_torch'
2025-12-04T12:53:03.6965337Z Entering 'third_party/fbgemm/external/json'
2025-12-04T12:53:03.6992565Z Entering 'third_party/flash-attention'
2025-12-04T12:53:03.7016731Z Entering 'third_party/flash-attention/csrc/composable_kernel'
2025-12-04T12:53:03.7054013Z Entering 'third_party/flash-attention/csrc/cutlass'
2025-12-04T12:53:03.7085578Z Entering 'third_party/flatbuffers'
2025-12-04T12:53:03.7108033Z Entering 'third_party/fmt'
2025-12-04T12:53:03.7135177Z Entering 'third_party/gemmlowp/gemmlowp'
2025-12-04T12:53:03.7161466Z Entering 'third_party/gloo'
2025-12-04T12:53:03.7184751Z Entering 'third_party/googletest'
2025-12-04T12:53:03.7208108Z Entering 'third_party/ideep'
2025-12-04T12:53:03.7234278Z Entering 'third_party/ideep/mkl-dnn'
2025-12-04T12:53:03.7270994Z Entering 'third_party/ittapi'
2025-12-04T12:53:03.7296252Z Entering 'third_party/kineto'
2025-12-04T12:53:03.7322191Z Entering 'third_party/kineto/libkineto/third_party/dynolog'
2025-12-04T12:53:03.7347972Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'
2025-12-04T12:53:03.7372180Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'
2025-12-04T12:53:03.7395453Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'
2025-12-04T12:53:03.7417706Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'
2025-12-04T12:53:03.7439774Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'
2025-12-04T12:53:03.7480344Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog'
2025-12-04T12:53:03.7510494Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'
2025-12-04T12:53:03.7532857Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json'
2025-12-04T12:53:03.7559587Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'
2025-12-04T12:53:03.7581863Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp'
2025-12-04T12:53:03.7603338Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb'
2025-12-04T12:53:03.7631016Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest'
2025-12-04T12:53:03.7657160Z Entering 'third_party/kineto/libkineto/third_party/fmt'
2025-12-04T12:53:03.7679267Z Entering 'third_party/kineto/libkineto/third_party/googletest'
2025-12-04T12:53:03.7708640Z Entering 'third_party/kleidiai'
2025-12-04T12:53:03.7741745Z Entering 'third_party/mimalloc'
2025-12-04T12:53:03.7764239Z Entering 'third_party/nlohmann'
2025-12-04T12:53:03.7788857Z Entering 'third_party/onnx'
2025-12-04T12:53:03.7819662Z Entering 'third_party/onnx/third_party/pybind11'
2025-12-04T12:53:03.7849126Z Entering 'third_party/opentelemetry-cpp'
2025-12-04T12:53:03.7872289Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark'
2025-12-04T12:53:03.7895883Z Entering 'third_party/opentelemetry-cpp/third_party/googletest'
2025-12-04T12:53:03.7921260Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl'
2025-12-04T12:53:03.7942936Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json'
2025-12-04T12:53:03.7965672Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto'
2025-12-04T12:53:03.7988776Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp'
2025-12-04T12:53:03.8015159Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp'
2025-12-04T12:53:03.8037131Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'
2025-12-04T12:53:03.8059460Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'
2025-12-04T12:53:03.8097561Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg'
2025-12-04T12:53:03.8137302Z Entering 'third_party/pocketfft'
2025-12-04T12:53:03.8176124Z Entering 'third_party/protobuf'
2025-12-04T12:53:03.8205461Z Entering 'third_party/protobuf/third_party/benchmark'
2025-12-04T12:53:03.8231035Z Entering 'third_party/protobuf/third_party/googletest'
2025-12-04T12:53:03.8256730Z Entering 'third_party/psimd'
2025-12-04T12:53:03.8281593Z Entering 'third_party/pthreadpool'
2025-12-04T12:53:03.8304561Z Entering 'third_party/pybind11'
2025-12-04T12:53:03.8326879Z Entering 'third_party/python-peachpy'
2025-12-04T12:53:03.8348020Z Entering 'third_party/sleef'
2025-12-04T12:53:03.8375785Z Entering 'third_party/tensorpipe'
2025-12-04T12:53:03.8400763Z Entering 'third_party/tensorpipe/third_party/googletest'
2025-12-04T12:53:03.8426393Z Entering 'third_party/tensorpipe/third_party/libnop'
2025-12-04T12:53:03.8446601Z Entering 'third_party/tensorpipe/third_party/libuv'
2025-12-04T12:53:03.8468023Z Entering 'third_party/tensorpipe/third_party/pybind11'
2025-12-04T12:53:03.8497171Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang'
2025-12-04T12:53:03.8546094Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:03.8565096Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url
2025-12-04T12:53:03.8776036Z Entering 'android/libs/fbjni'
2025-12-04T12:53:03.8795165Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url
2025-12-04T12:53:03.8804655Z Entering 'third_party/FP16'
2025-12-04T12:53:03.8821240Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url
2025-12-04T12:53:03.8832618Z Entering 'third_party/FXdiv'
2025-12-04T12:53:03.8849056Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url
2025-12-04T12:53:03.8861363Z Entering 'third_party/NNPACK'
2025-12-04T12:53:03.8873069Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url
2025-12-04T12:53:03.8887941Z Entering 'third_party/NVTX'
2025-12-04T12:53:03.8902221Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url
2025-12-04T12:53:03.8912495Z Entering 'third_party/VulkanMemoryAllocator'
2025-12-04T12:53:03.8926818Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url
2025-12-04T12:53:03.8938492Z Entering 'third_party/XNNPACK'
2025-12-04T12:53:03.8948895Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url
2025-12-04T12:53:03.8964302Z Entering 'third_party/aiter'
2025-12-04T12:53:03.8973528Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url
2025-12-04T12:53:03.8984422Z Entering 'third_party/aiter/3rdparty/composable_kernel'
2025-12-04T12:53:03.8994371Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url
2025-12-04T12:53:03.9008372Z Entering 'third_party/benchmark'
2025-12-04T12:53:03.9020423Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url
2025-12-04T12:53:03.9034805Z Entering 'third_party/composable_kernel'
2025-12-04T12:53:03.9052639Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url
2025-12-04T12:53:03.9064836Z Entering 'third_party/cpp-httplib'
2025-12-04T12:53:03.9075034Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url
2025-12-04T12:53:03.9089707Z Entering 'third_party/cpuinfo'
2025-12-04T12:53:03.9099084Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url
2025-12-04T12:53:03.9111485Z Entering 'third_party/cudnn_frontend'
2025-12-04T12:53:03.9121000Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url
2025-12-04T12:53:03.9129509Z Entering 'third_party/cutlass'
2025-12-04T12:53:03.9139151Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url
2025-12-04T12:53:03.9152621Z Entering 'third_party/fbgemm'
2025-12-04T12:53:03.9162120Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url
2025-12-04T12:53:03.9175948Z Entering 'third_party/fbgemm/external/asmjit'
2025-12-04T12:53:03.9185648Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url
2025-12-04T12:53:03.9194594Z Entering 'third_party/fbgemm/external/composable_kernel'
2025-12-04T12:53:03.9205846Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url
2025-12-04T12:53:03.9218684Z Entering 'third_party/fbgemm/external/cpuinfo'
2025-12-04T12:53:03.9235343Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url
2025-12-04T12:53:03.9251515Z Entering 'third_party/fbgemm/external/cutlass'
2025-12-04T12:53:03.9266439Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url
2025-12-04T12:53:03.9287134Z Entering 'third_party/fbgemm/external/googletest'
2025-12-04T12:53:03.9300608Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url
2025-12-04T12:53:03.9314809Z Entering 'third_party/fbgemm/external/hipify_torch'
2025-12-04T12:53:03.9332708Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url
2025-12-04T12:53:03.9344257Z Entering 'third_party/fbgemm/external/json'
2025-12-04T12:53:03.9356512Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url
2025-12-04T12:53:03.9368637Z Entering 'third_party/flash-attention'
2025-12-04T12:53:03.9381707Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url
2025-12-04T12:53:03.9390495Z Entering 'third_party/flash-attention/csrc/composable_kernel'
2025-12-04T12:53:03.9402578Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url
2025-12-04T12:53:03.9414658Z Entering 'third_party/flash-attention/csrc/cutlass'
2025-12-04T12:53:03.9428258Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url
2025-12-04T12:53:03.9441047Z Entering 'third_party/flatbuffers'
2025-12-04T12:53:03.9451197Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url
2025-12-04T12:53:03.9460779Z Entering 'third_party/fmt'
2025-12-04T12:53:03.9472756Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url
2025-12-04T12:53:03.9481963Z Entering 'third_party/gemmlowp/gemmlowp'
2025-12-04T12:53:03.9491342Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url
2025-12-04T12:53:03.9499876Z Entering 'third_party/gloo'
2025-12-04T12:53:03.9512393Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url
2025-12-04T12:53:03.9520773Z Entering 'third_party/googletest'
2025-12-04T12:53:03.9531052Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url
2025-12-04T12:53:03.9539226Z Entering 'third_party/ideep'
2025-12-04T12:53:03.9553686Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url
2025-12-04T12:53:03.9563230Z Entering 'third_party/ideep/mkl-dnn'
2025-12-04T12:53:03.9576961Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url
2025-12-04T12:53:03.9591103Z Entering 'third_party/ittapi'
2025-12-04T12:53:03.9604699Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url
2025-12-04T12:53:03.9614298Z Entering 'third_party/kineto'
2025-12-04T12:53:03.9629280Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url
2025-12-04T12:53:03.9639307Z Entering 'third_party/kineto/libkineto/third_party/dynolog'
2025-12-04T12:53:03.9649186Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url
2025-12-04T12:53:03.9658700Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'
2025-12-04T12:53:03.9674911Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url
2025-12-04T12:53:03.9689197Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'
2025-12-04T12:53:03.9704798Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url
2025-12-04T12:53:03.9720729Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'
2025-12-04T12:53:03.9733920Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url
2025-12-04T12:53:03.9745274Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'
2025-12-04T12:53:03.9755523Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url
2025-12-04T12:53:03.9763090Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'
2025-12-04T12:53:03.9779466Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url
2025-12-04T12:53:03.9792636Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog'
2025-12-04T12:53:03.9805227Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url
2025-12-04T12:53:03.9815507Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'
2025-12-04T12:53:03.9827651Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url
2025-12-04T12:53:03.9843012Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json'
2025-12-04T12:53:03.9857029Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url
2025-12-04T12:53:03.9867870Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'
2025-12-04T12:53:03.9883559Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url
2025-12-04T12:53:03.9892988Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp'
2025-12-04T12:53:03.9910230Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url
2025-12-04T12:53:03.9919541Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb'
2025-12-04T12:53:03.9936650Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url
2025-12-04T12:53:03.9947204Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest'
2025-12-04T12:53:03.9962405Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url
2025-12-04T12:53:03.9979464Z Entering 'third_party/kineto/libkineto/third_party/fmt'
2025-12-04T12:53:03.9991435Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url
2025-12-04T12:53:04.0001763Z Entering 'third_party/kineto/libkineto/third_party/googletest'
2025-12-04T12:53:04.0015594Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url
2025-12-04T12:53:04.0033363Z Entering 'third_party/kleidiai'
2025-12-04T12:53:04.0047927Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url
2025-12-04T12:53:04.0058827Z Entering 'third_party/mimalloc'
2025-12-04T12:53:04.0074179Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url
2025-12-04T12:53:04.0083654Z Entering 'third_party/nlohmann'
2025-12-04T12:53:04.0095455Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url
2025-12-04T12:53:04.0107150Z Entering 'third_party/onnx'
2025-12-04T12:53:04.0118305Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url
2025-12-04T12:53:04.0135810Z Entering 'third_party/onnx/third_party/pybind11'
2025-12-04T12:53:04.0148678Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url
2025-12-04T12:53:04.0163326Z Entering 'third_party/opentelemetry-cpp'
2025-12-04T12:53:04.0174826Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url
2025-12-04T12:53:04.0188336Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark'
2025-12-04T12:53:04.0198564Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url
2025-12-04T12:53:04.0206828Z Entering 'third_party/opentelemetry-cpp/third_party/googletest'
2025-12-04T12:53:04.0223019Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url
2025-12-04T12:53:04.0233118Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl'
2025-12-04T12:53:04.0243745Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url
2025-12-04T12:53:04.0253176Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json'
2025-12-04T12:53:04.0266445Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url
2025-12-04T12:53:04.0279952Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto'
2025-12-04T12:53:04.0291280Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url
2025-12-04T12:53:04.0301976Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp'
2025-12-04T12:53:04.0314608Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url
2025-12-04T12:53:04.0325525Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp'
2025-12-04T12:53:04.0343665Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url
2025-12-04T12:53:04.0353106Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'
2025-12-04T12:53:04.0365368Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url
2025-12-04T12:53:04.0374636Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'
2025-12-04T12:53:04.0383204Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url
2025-12-04T12:53:04.0393004Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg'
2025-12-04T12:53:04.0408676Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url
2025-12-04T12:53:04.0427767Z Entering 'third_party/pocketfft'
2025-12-04T12:53:04.0437588Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url
2025-12-04T12:53:04.0446389Z Entering 'third_party/protobuf'
2025-12-04T12:53:04.0460225Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url
2025-12-04T12:53:04.0471664Z Entering 'third_party/protobuf/third_party/benchmark'
2025-12-04T12:53:04.0486822Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url
2025-12-04T12:53:04.0496339Z Entering 'third_party/protobuf/third_party/googletest'
2025-12-04T12:53:04.0506590Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url
2025-12-04T12:53:04.0517024Z Entering 'third_party/psimd'
2025-12-04T12:53:04.0526923Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url
2025-12-04T12:53:04.0536476Z Entering 'third_party/pthreadpool'
2025-12-04T12:53:04.0550493Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url
2025-12-04T12:53:04.0562568Z Entering 'third_party/pybind11'
2025-12-04T12:53:04.0572512Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url
2025-12-04T12:53:04.0581024Z Entering 'third_party/python-peachpy'
2025-12-04T12:53:04.0589985Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url
2025-12-04T12:53:04.0598330Z Entering 'third_party/sleef'
2025-12-04T12:53:04.0608015Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url
2025-12-04T12:53:04.0618801Z Entering 'third_party/tensorpipe'
2025-12-04T12:53:04.0632430Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url
2025-12-04T12:53:04.0641978Z Entering 'third_party/tensorpipe/third_party/googletest'
2025-12-04T12:53:04.0654819Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url
2025-12-04T12:53:04.0664666Z Entering 'third_party/tensorpipe/third_party/libnop'
2025-12-04T12:53:04.0679702Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url
2025-12-04T12:53:04.0694018Z Entering 'third_party/tensorpipe/third_party/libuv'
2025-12-04T12:53:04.0708854Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url
2025-12-04T12:53:04.0717811Z Entering 'third_party/tensorpipe/third_party/pybind11'
2025-12-04T12:53:04.0727474Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url
2025-12-04T12:53:04.0741425Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang'
2025-12-04T12:53:04.0753769Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url
2025-12-04T12:53:04.0781754Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.0803930Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.0826485Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.0864849Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.0865416Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.0876410Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.0890872Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.0905829Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.0925822Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.0939818Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.0952984Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.0970704Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.0986741Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1005537Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1023183Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1036344Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1054018Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1070515Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1085295Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1103176Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1126794Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1147059Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1172515Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1190466Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1210528Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1225886Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1245509Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1261114Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1280078Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1294139Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1310015Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1323034Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1344012Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1366068Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1383527Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1406413Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1433453Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1449927Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1464464Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1477200Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1496762Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1511032Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1532077Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1556308Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1573332Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1588391Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1605937Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1620003Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1634574Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1648683Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1663423Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1681020Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1697267Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1711893Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1725707Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1740768Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1757440Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1773404Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1792227Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1807972Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1824335Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1838231Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1853997Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1869157Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1882975Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1902205Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1919580Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1938159Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1955652Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1979966Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.1996233Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.2013606Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.2029770Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.2044387Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.2061454Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.2075191Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.2096530Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.2112961Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.2127832Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.2146299Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.2165902Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T12:53:04.2184989Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic ***
2025-12-04T12:53:04.2207893Z ##[endgroup]
2025-12-04T12:53:04.2208064Z ##[group]Fetching the repository
2025-12-04T12:53:04.2211789Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/*
2025-12-04T12:53:04.8056847Z From https://github.com/pytorch/pytorch
2025-12-04T12:53:04.8057264Z - [deleted] (none) -> ciflow/trunk/169475
2025-12-04T12:53:07.9174425Z * [new branch] 2.6.0.dev20241004+ -> origin/2.6.0.dev20241004+
2025-12-04T12:53:07.9174984Z * [new branch] 2.9.1 -> origin/2.9.1
2025-12-04T12:53:07.9175570Z * [new branch] AaronWang04_addmmfusion_perftest -> origin/AaronWang04_addmmfusion_perftest
2025-12-04T12:53:07.9176224Z * [new branch] Flamefire-patch-1 -> origin/Flamefire-patch-1
2025-12-04T12:53:07.9176817Z * [new branch] HDCharles-2.6.0-release-notes -> origin/HDCharles-2.6.0-release-notes
2025-12-04T12:53:07.9177391Z * [new branch] HOPrintFunc -> origin/HOPrintFunc
2025-12-04T12:53:07.9177905Z * [new branch] IvanKobzarev/stack/1 -> origin/IvanKobzarev/stack/1
2025-12-04T12:53:07.9178408Z * [new branch] NicoshevSVE128 -> origin/NicoshevSVE128
2025-12-04T12:53:07.9178925Z * [new branch] PR-AOTInductorNoneBug -> origin/PR-AOTInductorNoneBug
2025-12-04T12:53:07.9179603Z * [new branch] PR-AOTInductorNoneBugFix -> origin/PR-AOTInductorNoneBugFix
2025-12-04T12:53:07.9180253Z * [new branch] PR-FixConfigsIssue -> origin/PR-FixConfigsIssue
2025-12-04T12:53:07.9180774Z * [new branch] PR-NoneBugFix-viable -> origin/PR-NoneBugFix-viable
2025-12-04T12:53:07.9181845Z * [new branch] PR-ResetToZero -> origin/PR-ResetToZero
2025-12-04T12:53:07.9182352Z * [new branch] Update-Flash-Packaging -> origin/Update-Flash-Packaging
2025-12-04T12:53:07.9182742Z * [new branch] VLA_exp -> origin/VLA_exp
2025-12-04T12:53:07.9183051Z * [new branch] activation_bench -> origin/activation_bench 2025-12-04T12:53:07.9183367Z * [new branch] addmm-heuristic -> origin/addmm-heuristic 2025-12-04T12:53:07.9183696Z * [new branch] adi/onednn_aarch64 -> origin/adi/onednn_aarch64 2025-12-04T12:53:07.9184008Z * [new branch] adi/test -> origin/adi/test 2025-12-04T12:53:07.9184302Z * [new branch] adi/test_bgemm -> origin/adi/test_bgemm 2025-12-04T12:53:07.9184619Z * [new branch] adi/test_m8g -> origin/adi/test_m8g 2025-12-04T12:53:07.9184926Z * [new branch] adi/test_onednn -> origin/adi/test_onednn 2025-12-04T12:53:07.9185254Z * [new branch] adi/test_onednn_v3.9 -> origin/adi/test_onednn_v3.9 2025-12-04T12:53:07.9185595Z * [new branch] adi/test_presve_change -> origin/adi/test_presve_change 2025-12-04T12:53:07.9185916Z * [new branch] adi/test_timm -> origin/adi/test_timm 2025-12-04T12:53:07.9186386Z * [new branch] adi/testpresve_change -> origin/adi/testpresve_change 2025-12-04T12:53:07.9186742Z * [new branch] aditew01/test/vec_bf16 -> origin/aditew01/test/vec_bf16 2025-12-04T12:53:07.9187099Z * [new branch] ah-globalfeedback-hook -> origin/ah-globalfeedback-hook 2025-12-04T12:53:07.9187478Z * [new branch] albanD-patch-1 -> origin/albanD-patch-1 2025-12-04T12:53:07.9187810Z * [new branch] also-surround-shimh -> origin/also-surround-shimh 2025-12-04T12:53:07.9188155Z * [new branch] angelayi/aot_compile -> origin/angelayi/aot_compile 2025-12-04T12:53:07.9188544Z * [new branch] angelayi/aoti_additional_files -> origin/angelayi/aoti_additional_files 2025-12-04T12:53:07.9188917Z * [new branch] angelayi/benchmark -> origin/angelayi/benchmark 2025-12-04T12:53:07.9189329Z * [new branch] angelayi/change_pytree_serialization -> origin/angelayi/change_pytree_serialization 2025-12-04T12:53:07.9189748Z * [new branch] angelayi/cpp_loader -> origin/angelayi/cpp_loader 2025-12-04T12:53:07.9190093Z * [new branch] angelayi/inductor_const -> origin/angelayi/inductor_const 2025-12-04T12:53:07.9190489Z * [new branch] angelayi/lstm -> origin/angelayi/lstm 2025-12-04T12:53:07.9190811Z * [new branch] angelayi/no_so_weight -> origin/angelayi/no_so_weight 2025-12-04T12:53:07.9191144Z * [new branch] angelayi/scan_layers -> origin/angelayi/scan_layers 2025-12-04T12:53:07.9191477Z * [new branch] angelayi/side_eff -> origin/angelayi/side_eff 2025-12-04T12:53:07.9191806Z * [new branch] angelayi/state_dict -> origin/angelayi/state_dict 2025-12-04T12:53:07.9192140Z * [new branch] angelayi/symint_input -> origin/angelayi/symint_input 2025-12-04T12:53:07.9192473Z * [new branch] angelayi/symm_mem -> origin/angelayi/symm_mem 2025-12-04T12:53:07.9192765Z * [new branch] angelayi/test_cpp -> origin/angelayi/test_cpp 2025-12-04T12:53:07.9193002Z * [new branch] angelayi/torch_size -> origin/angelayi/torch_size 2025-12-04T12:53:07.9193241Z * [new branch] annotate_assert -> origin/annotate_assert 2025-12-04T12:53:07.9193489Z * [new branch] annotate_fallback_kernel -> origin/annotate_fallback_kernel 2025-12-04T12:53:07.9193744Z * [new branch] annotation_deepcopy -> origin/annotation_deepcopy 2025-12-04T12:53:07.9194114Z * [new branch] annotation_dynamo -> origin/annotation_dynamo 2025-12-04T12:53:07.9194353Z * [new branch] aot_eager_stack_trace -> origin/aot_eager_stack_trace 2025-12-04T12:53:07.9194593Z * [new branch] aoti-cuda-alloc -> origin/aoti-cuda-alloc 2025-12-04T12:53:07.9194826Z * [new branch] aoti_const_device -> origin/aoti_const_device 2025-12-04T12:53:07.9195067Z * [new branch] aoti_fqn_name_interface -> origin/aoti_fqn_name_interface 
2025-12-04T12:53:07.9195336Z * [new branch] aoti_package_weights_binary -> origin/aoti_package_weights_binary 2025-12-04T12:53:07.9195599Z * [new branch] aoti_target_windows -> origin/aoti_target_windows 2025-12-04T12:53:07.9195885Z * [new branch] arsh/feat/inductor_check_profiling -> origin/arsh/feat/inductor_check_profiling 2025-12-04T12:53:07.9196163Z * [new branch] async_tp -> origin/async_tp 2025-12-04T12:53:07.9196429Z * [new branch] atalman-inductor-perf-cu124 -> origin/atalman-inductor-perf-cu124 2025-12-04T12:53:07.9196744Z * [new branch] atalman-inductor-perf-cu124.1 -> origin/atalman-inductor-perf-cu124.1 2025-12-04T12:53:07.9197024Z * [new branch] atalman-patch-2 -> origin/atalman-patch-2 2025-12-04T12:53:07.9197304Z * [new branch] atalman-patch-3 -> origin/atalman-patch-3 2025-12-04T12:53:07.9197533Z * [new branch] atalman-patch-4 -> origin/atalman-patch-4 2025-12-04T12:53:07.9197763Z * [new branch] atalman-patch-5 -> origin/atalman-patch-5 2025-12-04T12:53:07.9197995Z * [new branch] atalman-patch-6 -> origin/atalman-patch-6 2025-12-04T12:53:07.9198225Z * [new branch] atalman-patch-7 -> origin/atalman-patch-7 2025-12-04T12:53:07.9198455Z * [new branch] atalman-patch-8 -> origin/atalman-patch-8 2025-12-04T12:53:07.9198699Z * [new branch] atalman_inductor_2.3.1 -> origin/atalman_inductor_2.3.1 2025-12-04T12:53:07.9198987Z * [new branch] atalman_inductor_2.4.0 -> origin/atalman_inductor_2.4.0 2025-12-04T12:53:07.9199247Z * [new branch] atalman_inductor_2.4.x -> origin/atalman_inductor_2.4.x 2025-12-04T12:53:07.9199535Z * [new branch] attention_benchmarking_clean -> origin/attention_benchmarking_clean 2025-12-04T12:53:07.9199818Z * [new branch] bahuang/dt_fix_scalar_add -> origin/bahuang/dt_fix_scalar_add 2025-12-04T12:53:07.9200085Z * [new branch] bahuang/fix_debug_mode -> origin/bahuang/fix_debug_mode 2025-12-04T12:53:07.9200401Z * [new branch] bahuang/fix_expand -> origin/bahuang/fix_expand 2025-12-04T12:53:07.9200639Z * [new branch] bahuang/test -> origin/bahuang/test 2025-12-04T12:53:07.9200860Z * [new branch] base/1.5 -> origin/base/1.5 2025-12-04T12:53:07.9201128Z * [new branch] batching_sdpa_efficient_attention -> origin/batching_sdpa_efficient_attention 2025-12-04T12:53:07.9201415Z * [new branch] bench_scaled_mm_ops -> origin/bench_scaled_mm_ops 2025-12-04T12:53:07.9201667Z * [new branch] benchmark-updates -> origin/benchmark-updates 2025-12-04T12:53:07.9201926Z * [new branch] benchmarking-script -> origin/benchmarking-script 2025-12-04T12:53:07.9202182Z * [new branch] bertmaher/pinbump26 -> origin/bertmaher/pinbump26 2025-12-04T12:53:07.9202420Z * [new branch] bertrand/cutlass -> origin/bertrand/cutlass 2025-12-04T12:53:07.9202658Z * [new branch] bf/bug-static-input -> origin/bf/bug-static-input 2025-12-04T12:53:07.9202901Z * [new branch] bf/cg-backend -> origin/bf/cg-backend 2025-12-04T12:53:07.9203177Z * [new branch] bf/cg-nccl-test -> origin/bf/cg-nccl-test 2025-12-04T12:53:07.9203393Z * [new branch] bf/cg-remove-check -> origin/bf/cg-remove-check 2025-12-04T12:53:07.9203595Z * [new branch] bf/clean-torchbench-hf -> origin/bf/clean-torchbench-hf 2025-12-04T12:53:07.9203795Z * [new branch] bf/combo-debug-log -> origin/bf/combo-debug-log 2025-12-04T12:53:07.9203982Z * [new branch] bf/cudagraph -> origin/bf/cudagraph 2025-12-04T12:53:07.9204222Z * [new branch] bf/cudagraph-disable-input-mutation -> origin/bf/cudagraph-disable-input-mutation 2025-12-04T12:53:07.9204591Z * [new branch] bf/cudagraph-enable-input-mutation-support-benchmark -> 
origin/bf/cudagraph-enable-input-mutation-support-benchmark 2025-12-04T12:53:07.9204908Z * [new branch] bf/cudagraph-partition -> origin/bf/cudagraph-partition 2025-12-04T12:53:07.9205127Z * [new branch] bf/donated-buffer-bench -> origin/bf/donated-buffer-bench 2025-12-04T12:53:07.9205339Z * [new branch] bf/dynamo-partition -> origin/bf/dynamo-partition 2025-12-04T12:53:07.9205522Z * [new branch] bf/lite -> origin/bf/lite 2025-12-04T12:53:07.9205704Z * [new branch] bf/pa-non-divisible -> origin/bf/pa-non-divisible 2025-12-04T12:53:07.9205994Z * [new branch] bf/partition-cache-free-symbols -> origin/bf/partition-cache-free-symbols 2025-12-04T12:53:07.9206245Z * [new branch] bf/partition-memory-plan -> origin/bf/partition-memory-plan 2025-12-04T12:53:07.9206468Z * [new branch] bf/partition-move-cpu -> origin/bf/partition-move-cpu 2025-12-04T12:53:07.9206692Z * [new branch] bf/partition-view-fallback -> origin/bf/partition-view-fallback 2025-12-04T12:53:07.9206916Z * [new branch] bf/remove-check-55b0c39d -> origin/bf/remove-check-55b0c39d 2025-12-04T12:53:07.9207131Z * [new branch] bf/timm-nov-26-2025 -> origin/bf/timm-nov-26-2025 2025-12-04T12:53:07.9207347Z * [new branch] bf/transformer-pin-4-57-3 -> origin/bf/transformer-pin-4-57-3 2025-12-04T12:53:07.9207575Z * [new branch] bisect_perf_hf_T5_3acc6eac492 -> origin/bisect_perf_hf_T5_3acc6eac492 2025-12-04T12:53:07.9207811Z * [new branch] bisect_perf_hf_T5_3fcf66f61fb -> origin/bisect_perf_hf_T5_3fcf66f61fb 2025-12-04T12:53:07.9208037Z * [new branch] bisect_perf_hf_T5_4009d154129 -> origin/bisect_perf_hf_T5_4009d154129 2025-12-04T12:53:07.9208263Z * [new branch] bisect_perf_hf_T5_40d0740e73d -> origin/bisect_perf_hf_T5_40d0740e73d 2025-12-04T12:53:07.9208487Z * [new branch] bisect_perf_hf_T5_5268754e -> origin/bisect_perf_hf_T5_5268754e 2025-12-04T12:53:07.9208710Z * [new branch] bisect_perf_hf_T5_7d89a8d385c -> origin/bisect_perf_hf_T5_7d89a8d385c 2025-12-04T12:53:07.9208933Z * [new branch] bisect_perf_hf_T5_b7a25c1ee7c -> origin/bisect_perf_hf_T5_b7a25c1ee7c 2025-12-04T12:53:07.9209158Z * [new branch] bisect_perf_hf_T5_c25b201583f -> origin/bisect_perf_hf_T5_c25b201583f 2025-12-04T12:53:07.9209379Z * [new branch] bisect_perf_hf_T5_c93e57efac0 -> origin/bisect_perf_hf_T5_c93e57efac0 2025-12-04T12:53:07.9209612Z * [new branch] bisect_perf_hf_T5_ca9813ea149 -> origin/bisect_perf_hf_T5_ca9813ea149 2025-12-04T12:53:07.9209941Z * [new branch] bisect_perf_hf_T5_d65f194a -> origin/bisect_perf_hf_T5_d65f194a 2025-12-04T12:53:07.9210163Z * [new branch] bisect_perf_hf_T5_da94ab0b -> origin/bisect_perf_hf_T5_da94ab0b 2025-12-04T12:53:07.9210438Z * [new branch] bisect_perf_hf_T5_da94ab0b_new -> origin/bisect_perf_hf_T5_da94ab0b_new 2025-12-04T12:53:07.9210666Z * [new branch] bisect_perf_hf_T5_db4e8a1d8a8 -> origin/bisect_perf_hf_T5_db4e8a1d8a8 2025-12-04T12:53:07.9210931Z * [new branch] bisect_perf_hf_T5_e0d97e936a2 -> origin/bisect_perf_hf_T5_e0d97e936a2 2025-12-04T12:53:07.9211156Z * [new branch] bisect_perf_hf_T5_f23621ec563 -> origin/bisect_perf_hf_T5_f23621ec563 2025-12-04T12:53:07.9211369Z * [new branch] brister/fx_device_type -> origin/brister/fx_device_type 2025-12-04T12:53:07.9211592Z * [new branch] brister/test_inductor_all_fx -> origin/brister/test_inductor_all_fx 2025-12-04T12:53:07.9211854Z * [new branch] brister/tiled_reduction_no_numel_check -> origin/brister/tiled_reduction_no_numel_check 2025-12-04T12:53:07.9212093Z * [new branch] bwd-backup -> origin/bwd-backup 2025-12-04T12:53:07.9212265Z * [new branch] c57382a49 -> origin/c57382a49 
2025-12-04T12:53:07.9212435Z * [new branch] ca_0431d47eaa -> origin/ca_0431d47eaa 2025-12-04T12:53:07.9212610Z * [new branch] ca_fix_0431d47eaa -> origin/ca_fix_0431d47eaa 2025-12-04T12:53:07.9212808Z * [new branch] camyllh/test_setup_hooks_push -> origin/camyllh/test_setup_hooks_push 2025-12-04T12:53:07.9213018Z * [new branch] cccclai-patch-1 -> origin/cccclai-patch-1 2025-12-04T12:53:07.9213253Z * [new branch] cherry-pick-159969-by-pytorch_bot_bot_ -> origin/cherry-pick-159969-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9213583Z * [new branch] cherry-pick-160586-by-pytorch_bot_bot_ -> origin/cherry-pick-160586-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9213858Z * [new branch] cherry-pick-162208-by-pytorch_bot_bot_ -> origin/cherry-pick-162208-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9214136Z * [new branch] cherry-pick-163169-by-pytorch_bot_bot_ -> origin/cherry-pick-163169-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9214405Z * [new branch] cherry-pick-165086-by-pytorch_bot_bot_ -> origin/cherry-pick-165086-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9214682Z * [new branch] cherry-pick-165514-by-pytorch_bot_bot_ -> origin/cherry-pick-165514-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9214962Z * [new branch] cherry-pick-165601-by-pytorch_bot_bot_ -> origin/cherry-pick-165601-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9215232Z * [new branch] cherry-pick-165667-by-pytorch_bot_bot_ -> origin/cherry-pick-165667-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9215507Z * [new branch] cherry-pick-165815-by-pytorch_bot_bot_ -> origin/cherry-pick-165815-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9215784Z * [new branch] cherry-pick-165922-by-pytorch_bot_bot_ -> origin/cherry-pick-165922-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9216056Z * [new branch] cherry-pick-166148-by-pytorch_bot_bot_ -> origin/cherry-pick-166148-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9216332Z * [new branch] cherry-pick-166181-by-pytorch_bot_bot_ -> origin/cherry-pick-166181-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9216609Z * [new branch] cherry-pick-166404-by-pytorch_bot_bot_ -> origin/cherry-pick-166404-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9216889Z * [new branch] cherry-pick-166427-by-pytorch_bot_bot_ -> origin/cherry-pick-166427-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9217159Z * [new branch] cherry-pick-166480-by-pytorch_bot_bot_ -> origin/cherry-pick-166480-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9217443Z * [new branch] cherry-pick-166570-by-pytorch_bot_bot_ -> origin/cherry-pick-166570-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9217720Z * [new branch] cherry-pick-166993-by-pytorch_bot_bot_ -> origin/cherry-pick-166993-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9217986Z * [new branch] cherry-pick-167111-by-pytorch_bot_bot_ -> origin/cherry-pick-167111-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9218252Z * [new branch] cherry-pick-167478-by-pytorch_bot_bot_ -> origin/cherry-pick-167478-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9218513Z * [new branch] cherry_pick_166036_166040 -> origin/cherry_pick_166036_166040 2025-12-04T12:53:07.9218704Z * [new branch] cherry_pick_166457 -> origin/cherry_pick_166457 2025-12-04T12:53:07.9218891Z * [new branch] cherrypick_166338 -> origin/cherrypick_166338 2025-12-04T12:53:07.9219068Z * [new branch] cherrypick_166458 -> origin/cherrypick_166458 2025-12-04T12:53:07.9219245Z * [new branch] cherrypick_166586 -> origin/cherrypick_166586 2025-12-04T12:53:07.9219416Z * [new branch] cherrypick_166956 -> origin/cherrypick_166956 2025-12-04T12:53:07.9219582Z * [new branch] ci_attn -> origin/ci_attn 2025-12-04T12:53:07.9219747Z * [new branch] 
codex-testing -> origin/codex-testing 2025-12-04T12:53:07.9220008Z * [new branch] codex/add-check_memory_overlap-helper-functions -> origin/codex/add-check_memory_overlap-helper-functions 2025-12-04T12:53:07.9220359Z * [new branch] codex/fix-issue-121219-in-pytorch -> origin/codex/fix-issue-121219-in-pytorch 2025-12-04T12:53:07.9220680Z * [new branch] codex/investigate-segfaults-in-get_tensor_storage_id -> origin/codex/investigate-segfaults-in-get_tensor_storage_id 2025-12-04T12:53:07.9221076Z * [new branch] codex/refactor-lintrunner-config-to-use-uv-run -> origin/codex/refactor-lintrunner-config-to-use-uv-run 2025-12-04T12:53:07.9221342Z * [new branch] compatiblpy39util -> origin/compatiblpy39util 2025-12-04T12:53:07.9221518Z * [new branch] cond_hop_device -> origin/cond_hop_device 2025-12-04T12:53:07.9221693Z * [new branch] context_test -> origin/context_test 2025-12-04T12:53:07.9221921Z * [new branch] copilot/code-style-cleanup-python-pip -> origin/copilot/code-style-cleanup-python-pip 2025-12-04T12:53:07.9222164Z * [new branch] cpio/fix_new_ami_tests -> origin/cpio/fix_new_ami_tests 2025-12-04T12:53:07.9222381Z * [new branch] cpp-docs-dependency-upgrade -> origin/cpp-docs-dependency-upgrade 2025-12-04T12:53:07.9222636Z * [new branch] crpa/typo-in-inductor_comm_lowering -> origin/crpa/typo-in-inductor_comm_lowering 2025-12-04T12:53:07.9222865Z * [new branch] csl/always_produce_xml -> origin/csl/always_produce_xml 2025-12-04T12:53:07.9223061Z * [new branch] csl/build_test_more_procs -> origin/csl/build_test_more_procs 2025-12-04T12:53:07.9223269Z * [new branch] csl/build_test_more_procs2 -> origin/csl/build_test_more_procs2 2025-12-04T12:53:07.9223455Z * [new branch] csl/clean_up -> origin/csl/clean_up 2025-12-04T12:53:07.9223646Z * [new branch] csl/fix_retry_segfault_exit -> origin/csl/fix_retry_segfault_exit 2025-12-04T12:53:07.9223832Z * [new branch] csl/katex -> origin/csl/katex 2025-12-04T12:53:07.9224001Z * [new branch] csl/larger_runner -> origin/csl/larger_runner 2025-12-04T12:53:07.9224172Z * [new branch] csl/lint_testing -> origin/csl/lint_testing 2025-12-04T12:53:07.9224344Z * [new branch] csl/lint_thing -> origin/csl/lint_thing 2025-12-04T12:53:07.9224527Z * [new branch] csl/lintrunner_stuff -> origin/csl/lintrunner_stuff 2025-12-04T12:53:07.9224714Z * [new branch] csl/manually_gen_json -> origin/csl/manually_gen_json 2025-12-04T12:53:07.9224895Z * [new branch] csl/mps_sharding -> origin/csl/mps_sharding 2025-12-04T12:53:07.9225074Z * [new branch] csl/multistage_docker -> origin/csl/multistage_docker 2025-12-04T12:53:07.9225254Z * [new branch] csl/print_timing -> origin/csl/print_timing 2025-12-04T12:53:07.9225437Z * [new branch] csl/remove_experiment -> origin/csl/remove_experiment 2025-12-04T12:53:07.9225677Z * [new branch] csl/remove_maybe_unused_var -> origin/csl/remove_maybe_unused_var 2025-12-04T12:53:07.9225910Z * [new branch] csl/remove_repo_specific_autolabel -> origin/csl/remove_repo_specific_autolabel 2025-12-04T12:53:07.9226134Z * [new branch] csl/remove_run_parallel -> origin/csl/remove_run_parallel 2025-12-04T12:53:07.9226339Z * [new branch] csl/remove_unused_vars -> origin/csl/remove_unused_vars 2025-12-04T12:53:07.9226518Z * [new branch] csl/revert_open -> origin/csl/revert_open 2025-12-04T12:53:07.9226690Z * [new branch] csl/skip_build -> origin/csl/skip_build 2025-12-04T12:53:07.9226879Z * [new branch] csl/smaller_avx_amx_runenrs -> origin/csl/smaller_avx_amx_runenrs 2025-12-04T12:53:07.9227071Z * [new branch] csl/td_job_level -> origin/csl/td_job_level 
2025-12-04T12:53:07.9227282Z * [new branch] csl/test_cuda_build_large_runner -> origin/csl/test_cuda_build_large_runner 2025-12-04T12:53:07.9227524Z * [new branch] csl/test_owners_autograd_dispatch_nn -> origin/csl/test_owners_autograd_dispatch_nn 2025-12-04T12:53:07.9227773Z * [new branch] csl/test_owners_higher_confidence -> origin/csl/test_owners_higher_confidence 2025-12-04T12:53:07.9228017Z * [new branch] csl/upload_json_running -> origin/csl/upload_json_running 2025-12-04T12:53:07.9228197Z * [new branch] csl/win_sccache -> origin/csl/win_sccache 2025-12-04T12:53:07.9228372Z * [new branch] csl/xml_stuff -> origin/csl/xml_stuff 2025-12-04T12:53:07.9228542Z * [new branch] cublasrelax2 -> origin/cublasrelax2 2025-12-04T12:53:07.9228706Z * [new branch] cuda_mempool -> origin/cuda_mempool 2025-12-04T12:53:07.9228883Z * [new branch] custom_lowering_dict -> origin/custom_lowering_dict 2025-12-04T12:53:07.9229081Z * [new branch] d4l3k/debug_plane_frtrace -> origin/d4l3k/debug_plane_frtrace 2025-12-04T12:53:07.9229267Z * [new branch] daxia6/2.8o3 -> origin/daxia6/2.8o3 2025-12-04T12:53:07.9229439Z * [new branch] debug-guard -> origin/debug-guard 2025-12-04T12:53:07.9229617Z * [new branch] delete-quant-docs -> origin/delete-quant-docs 2025-12-04T12:53:07.9229940Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 2025-12-04T12:53:07.9230442Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.1 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.1 2025-12-04T12:53:07.9230775Z * [new branch] desertfire/test_cpp_wrapper -> origin/desertfire/test_cpp_wrapper 2025-12-04T12:53:07.9231015Z * [new branch] desertfire/triton-cpu-for-aarch64 -> origin/desertfire/triton-cpu-for-aarch64 2025-12-04T12:53:07.9231244Z * [new branch] dev/dhruva/flex_attn_opt -> origin/dev/dhruva/flex_attn_opt 2025-12-04T12:53:07.9231446Z * [new branch] dev/joona/MPSNDArrayAdd -> origin/dev/joona/MPSNDArrayAdd 2025-12-04T12:53:07.9231637Z * [new branch] dev/joona/Unranked -> origin/dev/joona/Unranked 2025-12-04T12:53:07.9231818Z * [new branch] dev/joona/cat -> origin/dev/joona/cat 2025-12-04T12:53:07.9231998Z * [new branch] dev/joona/embeddingbag -> origin/dev/joona/embeddingbag 2025-12-04T12:53:07.9232199Z * [new branch] dev/joona/fix_sdpa_memtest -> origin/dev/joona/fix_sdpa_memtest 2025-12-04T12:53:07.9232413Z * [new branch] dev/joona/getTensorsString -> origin/dev/joona/getTensorsString 2025-12-04T12:53:07.9232636Z * [new branch] dev/joona/mps_linear_macos14 -> origin/dev/joona/mps_linear_macos14 2025-12-04T12:53:07.9232880Z * [new branch] dev/joona/scalar_clamp -> origin/dev/joona/scalar_clamp 2025-12-04T12:53:07.9233064Z * [new branch] dev/joona/sdpa -> origin/dev/joona/sdpa 2025-12-04T12:53:07.9233241Z * [new branch] dev/joona/sdpa_api -> origin/dev/joona/sdpa_api 2025-12-04T12:53:07.9233428Z * [new branch] dev/joona/type_inf -> origin/dev/joona/type_inf 2025-12-04T12:53:07.9233624Z * [new branch] dev/joona/ulpAssertClose -> origin/dev/joona/ulpAssertClose 2025-12-04T12:53:07.9233815Z * [new branch] dev/joona/upsize3d -> origin/dev/joona/upsize3d 2025-12-04T12:53:07.9233987Z * [new branch] disp_counter -> origin/disp_counter 2025-12-04T12:53:07.9234163Z * [new branch] divyanshk-patch-1 -> origin/divyanshk-patch-1 2025-12-04T12:53:07.9234332Z * [new branch] docs -> origin/docs 2025-12-04T12:53:07.9234501Z * [new branch] documentation -> origin/documentation 
2025-12-04T12:53:07.9234680Z * [new branch] eager_model_benchmarks -> origin/eager_model_benchmarks 2025-12-04T12:53:07.9234884Z * [new branch] embg/test_inductor_ci_control -> origin/embg/test_inductor_ci_control 2025-12-04T12:53:07.9235135Z * [new branch] embg/triton_l2_prefetch_128B -> origin/embg/triton_l2_prefetch_128B 2025-12-04T12:53:07.9235349Z * [new branch] embg/triton_l2_prefetch_256B -> origin/embg/triton_l2_prefetch_256B 2025-12-04T12:53:07.9235542Z * [new branch] eqy-patch-1 -> origin/eqy-patch-1 2025-12-04T12:53:07.9235708Z * [new branch] eqy-patch-2 -> origin/eqy-patch-2 2025-12-04T12:53:07.9235873Z * [new branch] eqy-patch-3 -> origin/eqy-patch-3 2025-12-04T12:53:07.9236034Z * [new branch] eqy-patch-4 -> origin/eqy-patch-4 2025-12-04T12:53:07.9236200Z * [new branch] eqy-patch-5 -> origin/eqy-patch-5 2025-12-04T12:53:07.9236364Z * [new branch] eqy-patch-6 -> origin/eqy-patch-6 2025-12-04T12:53:07.9236542Z * [new branch] exclamaforte/amd-ma -> origin/exclamaforte/amd-ma 2025-12-04T12:53:07.9236777Z * [new branch] exclamaforte/combo-kernels-perf-run -> origin/exclamaforte/combo-kernels-perf-run 2025-12-04T12:53:07.9237029Z * [new branch] exclamaforte/do_bench_refactor -> origin/exclamaforte/do_bench_refactor 2025-12-04T12:53:07.9237273Z * [new branch] exclamaforte/enable-mem-dep-fusion -> origin/exclamaforte/enable-mem-dep-fusion 2025-12-04T12:53:07.9237558Z * [new branch] exclamaforte/fix-exhaustive-autotuning -> origin/exclamaforte/fix-exhaustive-autotuning 2025-12-04T12:53:07.9237845Z * [new branch] exclamaforte/fix-trace-parsing-fx-svg -> origin/exclamaforte/fix-trace-parsing-fx-svg 2025-12-04T12:53:07.9238142Z * [new branch] exclamaforte/force-pointwise-cat-perf-run -> origin/exclamaforte/force-pointwise-cat-perf-run 2025-12-04T12:53:07.9238407Z * [new branch] exclamaforte/fusion-data -> origin/exclamaforte/fusion-data 2025-12-04T12:53:07.9238638Z * [new branch] exclamaforte/gemm-benchmark-run -> origin/exclamaforte/gemm-benchmark-run 2025-12-04T12:53:07.9238878Z * [new branch] exclamaforte/gemm-export-model -> origin/exclamaforte/gemm-export-model 2025-12-04T12:53:07.9239101Z * [new branch] exclamaforte/gemm-model -> origin/exclamaforte/gemm-model 2025-12-04T12:53:07.9239363Z * [new branch] exclamaforte/gemm-model-all-data-collection -> origin/exclamaforte/gemm-model-all-data-collection 2025-12-04T12:53:07.9239629Z * [new branch] exclamaforte/gemm-to-amd -> origin/exclamaforte/gemm-to-amd 2025-12-04T12:53:07.9239884Z * [new branch] exclamaforte/just-gemm-model -> origin/exclamaforte/just-gemm-model 2025-12-04T12:53:07.9240148Z * [new branch] exclamaforte/just-gemm-model-no-refactor -> origin/exclamaforte/just-gemm-model-no-refactor 2025-12-04T12:53:07.9240469Z * [new branch] exclamaforte/profile-diff-algo -> origin/exclamaforte/profile-diff-algo 2025-12-04T12:53:07.9240734Z * [new branch] exclamaforte/profiler-visualization -> origin/exclamaforte/profiler-visualization 2025-12-04T12:53:07.9240996Z * [new branch] exclamaforte/test_cpp_wrapper_mode -> origin/exclamaforte/test_cpp_wrapper_mode 2025-12-04T12:53:07.9241257Z * [new branch] exclamaforte/update-autotune-configs -> origin/exclamaforte/update-autotune-configs 2025-12-04T12:53:07.9241546Z * [new branch] exclamaforte/update-autotune-configs-2 -> origin/exclamaforte/update-autotune-configs-2 2025-12-04T12:53:07.9241769Z * [new branch] exec -> origin/exec 2025-12-04T12:53:07.9241941Z * [new branch] experimental-mosaic -> origin/experimental-mosaic 2025-12-04T12:53:07.9242125Z * [new branch] export-D61047529 -> 
origin/export-D61047529 2025-12-04T12:53:07.9242299Z * [new branch] export-D71412006 -> origin/export-D71412006 2025-12-04T12:53:07.9242561Z * [new branch] export-D73042989 -> origin/export-D73042989 2025-12-04T12:53:07.9242734Z * [new branch] export-D78957093 -> origin/export-D78957093 2025-12-04T12:53:07.9242903Z * [new branch] export-D78996107 -> origin/export-D78996107 2025-12-04T12:53:07.9243068Z * [new branch] export-D80823877 -> origin/export-D80823877 2025-12-04T12:53:07.9243241Z * [new branch] export-D80958642 -> origin/export-D80958642 2025-12-04T12:53:07.9243415Z * [new branch] export-D81054193 -> origin/export-D81054193 2025-12-04T12:53:07.9243587Z * [new branch] export-D81204584 -> origin/export-D81204584 2025-12-04T12:53:07.9243757Z * [new branch] export-D81429090 -> origin/export-D81429090 2025-12-04T12:53:07.9243923Z * [new branch] export-D82250826 -> origin/export-D82250826 2025-12-04T12:53:07.9244096Z * [new branch] export-D82253817 -> origin/export-D82253817 2025-12-04T12:53:07.9244268Z * [new branch] export-D83541846 -> origin/export-D83541846 2025-12-04T12:53:07.9244436Z * [new branch] export-D83627170 -> origin/export-D83627170 2025-12-04T12:53:07.9244605Z * [new branch] export-D83766701 -> origin/export-D83766701 2025-12-04T12:53:07.9244774Z * [new branch] export-D83768878 -> origin/export-D83768878 2025-12-04T12:53:07.9244945Z * [new branch] export-D83769447 -> origin/export-D83769447 2025-12-04T12:53:07.9245120Z * [new branch] export-D84089824 -> origin/export-D84089824 2025-12-04T12:53:07.9245292Z * [new branch] export-D84213020 -> origin/export-D84213020 2025-12-04T12:53:07.9245458Z * [new branch] export-D84373821 -> origin/export-D84373821 2025-12-04T12:53:07.9245629Z * [new branch] export-D84612194 -> origin/export-D84612194 2025-12-04T12:53:07.9245799Z * [new branch] export-D84890985 -> origin/export-D84890985 2025-12-04T12:53:07.9245963Z * [new branch] export-D85122326 -> origin/export-D85122326 2025-12-04T12:53:07.9246136Z * [new branch] export-D86256198 -> origin/export-D86256198 2025-12-04T12:53:07.9246306Z * [new branch] export-D86460608 -> origin/export-D86460608 2025-12-04T12:53:07.9246471Z * [new branch] export-D86474796 -> origin/export-D86474796 2025-12-04T12:53:07.9246674Z * [new branch] export-D86712396 -> origin/export-D86712396 2025-12-04T12:53:07.9246846Z * [new branch] export-D87022129 -> origin/export-D87022129 2025-12-04T12:53:07.9247012Z * [new branch] export-D87838959 -> origin/export-D87838959 2025-12-04T12:53:07.9247180Z * [new branch] export-D88319437 -> origin/export-D88319437 2025-12-04T12:53:07.9247398Z * [new branch] exported-model-train-idempotent -> origin/exported-model-train-idempotent 2025-12-04T12:53:07.9247628Z * [new branch] ezyang-titan-october -> origin/ezyang-titan-october 2025-12-04T12:53:07.9247827Z * [new branch] ezyang-titan-october2 -> origin/ezyang-titan-october2 2025-12-04T12:53:07.9248005Z * [new branch] ezyang-war -> origin/ezyang-war 2025-12-04T12:53:07.9248198Z * [new branch] ezyang/wip-aot-descriptors -> origin/ezyang/wip-aot-descriptors 2025-12-04T12:53:07.9248394Z * [new branch] fa_u8_brgemm -> origin/fa_u8_brgemm 2025-12-04T12:53:07.9248576Z * [new branch] fadeputr/sequence_fbgemm -> origin/fadeputr/sequence_fbgemm 2025-12-04T12:53:07.9248766Z * [new branch] fastmath_baseline -> origin/fastmath_baseline 2025-12-04T12:53:07.9248937Z * [new branch] fbcode/warm -> origin/fbcode/warm 2025-12-04T12:53:07.9249123Z * [new branch] fca -> origin/fca 2025-12-04T12:53:07.9249282Z * [new branch] fca2_ca5984c -> 
origin/fca2_ca5984c 2025-12-04T12:53:07.9249441Z * [new branch] fca5 -> origin/fca5 2025-12-04T12:53:07.9249617Z * [new branch] feature/justknobs-cpp -> origin/feature/justknobs-cpp 2025-12-04T12:53:07.9249814Z * [new branch] feature/numa-forkserver -> origin/feature/numa-forkserver 2025-12-04T12:53:07.9250018Z * [new branch] ffast_math_baseline -> origin/ffast_math_baseline 2025-12-04T12:53:07.9250238Z * [new branch] ffast_math_target -> origin/ffast_math_target 2025-12-04T12:53:07.9250420Z * [new branch] findhao/base_commit -> origin/findhao/base_commit 2025-12-04T12:53:07.9250611Z * [new branch] findhao/base_commit1 -> origin/findhao/base_commit1 2025-12-04T12:53:07.9250798Z * [new branch] findhao/multistream2 -> origin/findhao/multistream2 2025-12-04T12:53:07.9250987Z * [new branch] findhao/multistream5 -> origin/findhao/multistream5 2025-12-04T12:53:07.9251169Z * [new branch] findhao/multistream6 -> origin/findhao/multistream6 2025-12-04T12:53:07.9251361Z * [new branch] findhao/operatorbench3 -> origin/findhao/operatorbench3 2025-12-04T12:53:07.9251564Z * [new branch] findhao/operatorbench5 -> origin/findhao/operatorbench5 2025-12-04T12:53:07.9251755Z * [new branch] findhao/tritonparse -> origin/findhao/tritonparse 2025-12-04T12:53:07.9251968Z * [new branch] fix-ck-gemm-template-format -> origin/fix-ck-gemm-template-format 2025-12-04T12:53:07.9252174Z * [new branch] fix-config-ignore -> origin/fix-config-ignore 2025-12-04T12:53:07.9252349Z * [new branch] fix-dict-guard -> origin/fix-dict-guard 2025-12-04T12:53:07.9252531Z * [new branch] fix_addmm_issue -> origin/fix_addmm_issue 2025-12-04T12:53:07.9252729Z * [new branch] fix_amd_missing_cluster_dims -> origin/fix_amd_missing_cluster_dims 2025-12-04T12:53:07.9252925Z * [new branch] fix_bench_bwd_pass -> origin/fix_bench_bwd_pass 2025-12-04T12:53:07.9253110Z * [new branch] fix_mem_profiler_config -> origin/fix_mem_profiler_config 2025-12-04T12:53:07.9253298Z * [new branch] fix_nvrtc_discovery -> origin/fix_nvrtc_discovery 2025-12-04T12:53:07.9253510Z * [new branch] fix_op_runner -> origin/fix_op_runner 2025-12-04T12:53:07.9253678Z * [new branch] fix_ubn_159469 -> origin/fix_ubn_159469 2025-12-04T12:53:07.9253852Z * [new branch] fixes-triage -> origin/fixes-triage 2025-12-04T12:53:07.9254021Z * [new branch] fixflashinfer -> origin/fixflashinfer 2025-12-04T12:53:07.9254199Z * [new branch] flash_decoding_cpu -> origin/flash_decoding_cpu 2025-12-04T12:53:07.9254375Z * [new branch] flex-flash -> origin/flex-flash 2025-12-04T12:53:07.9254573Z * [new branch] flex_attention_functorch_grad -> origin/flex_attention_functorch_grad 2025-12-04T12:53:07.9254768Z * [new branch] flex_flash -> origin/flex_flash 2025-12-04T12:53:07.9254966Z * [new branch] fmassa/fix_memeff_sharding_rule -> origin/fmassa/fix_memeff_sharding_rule 2025-12-04T12:53:07.9255209Z * [new branch] fmassa/tests_comm_compute_scheduler -> origin/fmassa/tests_comm_compute_scheduler 2025-12-04T12:53:07.9255428Z * [new branch] forkserver_fix -> origin/forkserver_fix 2025-12-04T12:53:07.9255599Z * [new branch] fsdp2_trace_rules -> origin/fsdp2_trace_rules 2025-12-04T12:53:07.9255768Z * [new branch] fx_cpp -> origin/fx_cpp 2025-12-04T12:53:07.9255962Z * [new branch] fy/fix-win -> origin/fy/fix-win 2025-12-04T12:53:07.9256125Z * [new branch] galv-patch-1 -> origin/galv-patch-1 2025-12-04T12:53:07.9256355Z * [new branch] galv/cudagraphs-conditional-nodes-4 -> origin/galv/cudagraphs-conditional-nodes-4 2025-12-04T12:53:07.9256607Z * [new branch] georgehong/cmakelists-patch -> 
origin/georgehong/cmakelists-patch 2025-12-04T12:53:07.9256810Z * [new branch] gh/AlnisM/1/base -> origin/gh/AlnisM/1/base 2025-12-04T12:53:07.9256987Z * [new branch] gh/AlnisM/1/head -> origin/gh/AlnisM/1/head 2025-12-04T12:53:07.9257168Z * [new branch] gh/EikanWang/67/base -> origin/gh/EikanWang/67/base 2025-12-04T12:53:07.9257357Z * [new branch] gh/EikanWang/67/head -> origin/gh/EikanWang/67/head 2025-12-04T12:53:07.9257544Z * [new branch] gh/Gasoonjia/1/base -> origin/gh/Gasoonjia/1/base 2025-12-04T12:53:07.9257730Z * [new branch] gh/Gasoonjia/1/head -> origin/gh/Gasoonjia/1/head 2025-12-04T12:53:07.9257908Z * [new branch] gh/H-Huang/131/base -> origin/gh/H-Huang/131/base 2025-12-04T12:53:07.9258088Z * [new branch] gh/H-Huang/131/head -> origin/gh/H-Huang/131/head 2025-12-04T12:53:07.9258269Z * [new branch] gh/H-Huang/131/orig -> origin/gh/H-Huang/131/orig 2025-12-04T12:53:07.9258453Z * [new branch] gh/H-Huang/132/base -> origin/gh/H-Huang/132/base 2025-12-04T12:53:07.9258637Z * [new branch] gh/H-Huang/132/head -> origin/gh/H-Huang/132/head 2025-12-04T12:53:07.9258815Z * [new branch] gh/H-Huang/132/orig -> origin/gh/H-Huang/132/orig 2025-12-04T12:53:07.9259014Z * [new branch] gh/H-Huang/180/base -> origin/gh/H-Huang/180/base 2025-12-04T12:53:07.9259197Z * [new branch] gh/H-Huang/180/head -> origin/gh/H-Huang/180/head 2025-12-04T12:53:07.9259375Z * [new branch] gh/H-Huang/180/orig -> origin/gh/H-Huang/180/orig 2025-12-04T12:53:07.9259549Z * [new branch] gh/H-Huang/182/base -> origin/gh/H-Huang/182/base 2025-12-04T12:53:07.9259737Z * [new branch] gh/H-Huang/182/head -> origin/gh/H-Huang/182/head 2025-12-04T12:53:07.9259913Z * [new branch] gh/H-Huang/182/orig -> origin/gh/H-Huang/182/orig 2025-12-04T12:53:07.9260090Z * [new branch] gh/H-Huang/226/base -> origin/gh/H-Huang/226/base 2025-12-04T12:53:07.9260368Z * [new branch] gh/H-Huang/226/head -> origin/gh/H-Huang/226/head 2025-12-04T12:53:07.9260545Z * [new branch] gh/H-Huang/226/orig -> origin/gh/H-Huang/226/orig 2025-12-04T12:53:07.9260731Z * [new branch] gh/H-Huang/228/base -> origin/gh/H-Huang/228/base 2025-12-04T12:53:07.9260909Z * [new branch] gh/H-Huang/228/head -> origin/gh/H-Huang/228/head 2025-12-04T12:53:07.9261084Z * [new branch] gh/H-Huang/228/orig -> origin/gh/H-Huang/228/orig 2025-12-04T12:53:07.9261276Z * [new branch] gh/IvanKobzarev/150/base -> origin/gh/IvanKobzarev/150/base 2025-12-04T12:53:07.9261480Z * [new branch] gh/IvanKobzarev/150/head -> origin/gh/IvanKobzarev/150/head 2025-12-04T12:53:07.9261684Z * [new branch] gh/IvanKobzarev/150/orig -> origin/gh/IvanKobzarev/150/orig 2025-12-04T12:53:07.9261882Z * [new branch] gh/IvanKobzarev/157/base -> origin/gh/IvanKobzarev/157/base 2025-12-04T12:53:07.9262083Z * [new branch] gh/IvanKobzarev/157/head -> origin/gh/IvanKobzarev/157/head 2025-12-04T12:53:07.9262278Z * [new branch] gh/IvanKobzarev/157/orig -> origin/gh/IvanKobzarev/157/orig 2025-12-04T12:53:07.9262475Z * [new branch] gh/IvanKobzarev/159/base -> origin/gh/IvanKobzarev/159/base 2025-12-04T12:53:07.9262720Z * [new branch] gh/IvanKobzarev/159/head -> origin/gh/IvanKobzarev/159/head 2025-12-04T12:53:07.9262917Z * [new branch] gh/IvanKobzarev/159/orig -> origin/gh/IvanKobzarev/159/orig 2025-12-04T12:53:07.9263113Z * [new branch] gh/IvanKobzarev/162/base -> origin/gh/IvanKobzarev/162/base 2025-12-04T12:53:07.9263312Z * [new branch] gh/IvanKobzarev/162/head -> origin/gh/IvanKobzarev/162/head 2025-12-04T12:53:07.9263506Z * [new branch] gh/IvanKobzarev/162/orig -> origin/gh/IvanKobzarev/162/orig 
2025-12-04T12:53:07.9263710Z * [new branch] gh/IvanKobzarev/163/base -> origin/gh/IvanKobzarev/163/base 2025-12-04T12:53:07.9263910Z * [new branch] gh/IvanKobzarev/163/head -> origin/gh/IvanKobzarev/163/head 2025-12-04T12:53:07.9264104Z * [new branch] gh/IvanKobzarev/163/orig -> origin/gh/IvanKobzarev/163/orig 2025-12-04T12:53:07.9264302Z * [new branch] gh/IvanKobzarev/166/base -> origin/gh/IvanKobzarev/166/base 2025-12-04T12:53:07.9264500Z * [new branch] gh/IvanKobzarev/166/head -> origin/gh/IvanKobzarev/166/head 2025-12-04T12:53:07.9264701Z * [new branch] gh/IvanKobzarev/166/orig -> origin/gh/IvanKobzarev/166/orig 2025-12-04T12:53:07.9264898Z * [new branch] gh/IvanKobzarev/167/base -> origin/gh/IvanKobzarev/167/base 2025-12-04T12:53:07.9265092Z * [new branch] gh/IvanKobzarev/167/head -> origin/gh/IvanKobzarev/167/head 2025-12-04T12:53:07.9265288Z * [new branch] gh/IvanKobzarev/167/orig -> origin/gh/IvanKobzarev/167/orig 2025-12-04T12:53:07.9265486Z * [new branch] gh/IvanKobzarev/168/base -> origin/gh/IvanKobzarev/168/base 2025-12-04T12:53:07.9265689Z * [new branch] gh/IvanKobzarev/168/head -> origin/gh/IvanKobzarev/168/head 2025-12-04T12:53:07.9265889Z * [new branch] gh/IvanKobzarev/168/orig -> origin/gh/IvanKobzarev/168/orig 2025-12-04T12:53:07.9266087Z * [new branch] gh/IvanKobzarev/169/base -> origin/gh/IvanKobzarev/169/base 2025-12-04T12:53:07.9266281Z * [new branch] gh/IvanKobzarev/169/head -> origin/gh/IvanKobzarev/169/head 2025-12-04T12:53:07.9266477Z * [new branch] gh/IvanKobzarev/169/orig -> origin/gh/IvanKobzarev/169/orig 2025-12-04T12:53:07.9266681Z * [new branch] gh/IvanKobzarev/170/base -> origin/gh/IvanKobzarev/170/base 2025-12-04T12:53:07.9266876Z * [new branch] gh/IvanKobzarev/170/head -> origin/gh/IvanKobzarev/170/head 2025-12-04T12:53:07.9267074Z * [new branch] gh/IvanKobzarev/170/orig -> origin/gh/IvanKobzarev/170/orig 2025-12-04T12:53:07.9267313Z * [new branch] gh/IvanKobzarev/171/base -> origin/gh/IvanKobzarev/171/base 2025-12-04T12:53:07.9267510Z * [new branch] gh/IvanKobzarev/171/head -> origin/gh/IvanKobzarev/171/head 2025-12-04T12:53:07.9267710Z * [new branch] gh/IvanKobzarev/171/orig -> origin/gh/IvanKobzarev/171/orig 2025-12-04T12:53:07.9267909Z * [new branch] gh/IvanKobzarev/172/base -> origin/gh/IvanKobzarev/172/base 2025-12-04T12:53:07.9268104Z * [new branch] gh/IvanKobzarev/172/head -> origin/gh/IvanKobzarev/172/head 2025-12-04T12:53:07.9268300Z * [new branch] gh/IvanKobzarev/172/orig -> origin/gh/IvanKobzarev/172/orig 2025-12-04T12:53:07.9268503Z * [new branch] gh/IvanKobzarev/173/base -> origin/gh/IvanKobzarev/173/base 2025-12-04T12:53:07.9268701Z * [new branch] gh/IvanKobzarev/173/head -> origin/gh/IvanKobzarev/173/head 2025-12-04T12:53:07.9268905Z * [new branch] gh/IvanKobzarev/173/orig -> origin/gh/IvanKobzarev/173/orig 2025-12-04T12:53:07.9269102Z * [new branch] gh/IvanKobzarev/174/base -> origin/gh/IvanKobzarev/174/base 2025-12-04T12:53:07.9269301Z * [new branch] gh/IvanKobzarev/174/head -> origin/gh/IvanKobzarev/174/head 2025-12-04T12:53:07.9269524Z * [new branch] gh/IvanKobzarev/174/orig -> origin/gh/IvanKobzarev/174/orig 2025-12-04T12:53:07.9269723Z * [new branch] gh/IvanKobzarev/175/base -> origin/gh/IvanKobzarev/175/base 2025-12-04T12:53:07.9269918Z * [new branch] gh/IvanKobzarev/175/head -> origin/gh/IvanKobzarev/175/head 2025-12-04T12:53:07.9270115Z * [new branch] gh/IvanKobzarev/175/orig -> origin/gh/IvanKobzarev/175/orig 2025-12-04T12:53:07.9270355Z * [new branch] gh/IvanKobzarev/176/base -> origin/gh/IvanKobzarev/176/base 
2025-12-04T12:53:07.9270549Z * [new branch] gh/IvanKobzarev/176/head -> origin/gh/IvanKobzarev/176/head 2025-12-04T12:53:07.9270753Z * [new branch] gh/IvanKobzarev/176/orig -> origin/gh/IvanKobzarev/176/orig 2025-12-04T12:53:07.9270950Z * [new branch] gh/IvanKobzarev/177/base -> origin/gh/IvanKobzarev/177/base 2025-12-04T12:53:07.9271151Z * [new branch] gh/IvanKobzarev/177/head -> origin/gh/IvanKobzarev/177/head 2025-12-04T12:53:07.9271351Z * [new branch] gh/IvanKobzarev/177/orig -> origin/gh/IvanKobzarev/177/orig 2025-12-04T12:53:07.9271549Z * [new branch] gh/IvanKobzarev/178/base -> origin/gh/IvanKobzarev/178/base 2025-12-04T12:53:07.9271748Z * [new branch] gh/IvanKobzarev/178/head -> origin/gh/IvanKobzarev/178/head 2025-12-04T12:53:07.9271946Z * [new branch] gh/IvanKobzarev/178/orig -> origin/gh/IvanKobzarev/178/orig 2025-12-04T12:53:07.9272139Z * [new branch] gh/IvanKobzarev/179/base -> origin/gh/IvanKobzarev/179/base 2025-12-04T12:53:07.9272343Z * [new branch] gh/IvanKobzarev/179/head -> origin/gh/IvanKobzarev/179/head 2025-12-04T12:53:07.9272540Z * [new branch] gh/IvanKobzarev/179/orig -> origin/gh/IvanKobzarev/179/orig 2025-12-04T12:53:07.9272736Z * [new branch] gh/IvanKobzarev/180/base -> origin/gh/IvanKobzarev/180/base 2025-12-04T12:53:07.9272934Z * [new branch] gh/IvanKobzarev/180/head -> origin/gh/IvanKobzarev/180/head 2025-12-04T12:53:07.9273130Z * [new branch] gh/IvanKobzarev/180/orig -> origin/gh/IvanKobzarev/180/orig 2025-12-04T12:53:07.9273328Z * [new branch] gh/IvanKobzarev/181/base -> origin/gh/IvanKobzarev/181/base 2025-12-04T12:53:07.9273526Z * [new branch] gh/IvanKobzarev/181/head -> origin/gh/IvanKobzarev/181/head 2025-12-04T12:53:07.9273722Z * [new branch] gh/IvanKobzarev/181/orig -> origin/gh/IvanKobzarev/181/orig 2025-12-04T12:53:07.9273916Z * [new branch] gh/IvanKobzarev/182/base -> origin/gh/IvanKobzarev/182/base 2025-12-04T12:53:07.9274160Z * [new branch] gh/IvanKobzarev/182/head -> origin/gh/IvanKobzarev/182/head 2025-12-04T12:53:07.9274361Z * [new branch] gh/IvanKobzarev/182/orig -> origin/gh/IvanKobzarev/182/orig 2025-12-04T12:53:07.9274559Z * [new branch] gh/IvanKobzarev/183/base -> origin/gh/IvanKobzarev/183/base 2025-12-04T12:53:07.9274758Z * [new branch] gh/IvanKobzarev/183/head -> origin/gh/IvanKobzarev/183/head 2025-12-04T12:53:07.9274957Z * [new branch] gh/IvanKobzarev/183/orig -> origin/gh/IvanKobzarev/183/orig 2025-12-04T12:53:07.9275151Z * [new branch] gh/IvanKobzarev/184/base -> origin/gh/IvanKobzarev/184/base 2025-12-04T12:53:07.9275354Z * [new branch] gh/IvanKobzarev/184/head -> origin/gh/IvanKobzarev/184/head 2025-12-04T12:53:07.9275550Z * [new branch] gh/IvanKobzarev/184/orig -> origin/gh/IvanKobzarev/184/orig 2025-12-04T12:53:07.9275748Z * [new branch] gh/NikhilAPatel/1/base -> origin/gh/NikhilAPatel/1/base 2025-12-04T12:53:07.9275945Z * [new branch] gh/NikhilAPatel/1/head -> origin/gh/NikhilAPatel/1/head 2025-12-04T12:53:07.9276142Z * [new branch] gh/NikhilAPatel/2/base -> origin/gh/NikhilAPatel/2/base 2025-12-04T12:53:07.9276337Z * [new branch] gh/NikhilAPatel/2/head -> origin/gh/NikhilAPatel/2/head 2025-12-04T12:53:07.9276562Z * [new branch] gh/NikhilAPatel/4/base -> origin/gh/NikhilAPatel/4/base 2025-12-04T12:53:07.9276754Z * [new branch] gh/NikhilAPatel/4/head -> origin/gh/NikhilAPatel/4/head 2025-12-04T12:53:07.9276945Z * [new branch] gh/NikhilAPatel/5/base -> origin/gh/NikhilAPatel/5/base 2025-12-04T12:53:07.9277137Z * [new branch] gh/NikhilAPatel/5/head -> origin/gh/NikhilAPatel/5/head 2025-12-04T12:53:07.9277327Z * [new branch] 
gh/NikhilAPatel/5/orig -> origin/gh/NikhilAPatel/5/orig 2025-12-04T12:53:07.9277521Z * [new branch] gh/PaliC/17/base -> origin/gh/PaliC/17/base 2025-12-04T12:53:07.9277701Z * [new branch] gh/PaliC/17/head -> origin/gh/PaliC/17/head 2025-12-04T12:53:07.9277874Z * [new branch] gh/PaliC/17/orig -> origin/gh/PaliC/17/orig 2025-12-04T12:53:07.9278048Z * [new branch] gh/PaliC/18/base -> origin/gh/PaliC/18/base 2025-12-04T12:53:07.9278228Z * [new branch] gh/PaliC/18/head -> origin/gh/PaliC/18/head 2025-12-04T12:53:07.9278396Z * [new branch] gh/PaliC/18/orig -> origin/gh/PaliC/18/orig 2025-12-04T12:53:07.9278567Z * [new branch] gh/PaliC/20/base -> origin/gh/PaliC/20/base 2025-12-04T12:53:07.9278739Z * [new branch] gh/PaliC/20/head -> origin/gh/PaliC/20/head 2025-12-04T12:53:07.9278908Z * [new branch] gh/PaliC/20/orig -> origin/gh/PaliC/20/orig 2025-12-04T12:53:07.9279081Z * [new branch] gh/PaliC/21/base -> origin/gh/PaliC/21/base 2025-12-04T12:53:07.9279265Z * [new branch] gh/PaliC/21/head -> origin/gh/PaliC/21/head 2025-12-04T12:53:07.9279434Z * [new branch] gh/PaliC/21/orig -> origin/gh/PaliC/21/orig 2025-12-04T12:53:07.9279607Z * [new branch] gh/PaliC/23/base -> origin/gh/PaliC/23/base 2025-12-04T12:53:07.9279780Z * [new branch] gh/PaliC/23/head -> origin/gh/PaliC/23/head 2025-12-04T12:53:07.9279950Z * [new branch] gh/PaliC/23/orig -> origin/gh/PaliC/23/orig 2025-12-04T12:53:07.9280120Z * [new branch] gh/PaliC/24/base -> origin/gh/PaliC/24/base 2025-12-04T12:53:07.9280355Z * [new branch] gh/PaliC/24/head -> origin/gh/PaliC/24/head 2025-12-04T12:53:07.9280524Z * [new branch] gh/PaliC/24/orig -> origin/gh/PaliC/24/orig 2025-12-04T12:53:07.9280735Z * [new branch] gh/PaliC/25/head -> origin/gh/PaliC/25/head 2025-12-04T12:53:07.9280905Z * [new branch] gh/PaliC/25/next -> origin/gh/PaliC/25/next 2025-12-04T12:53:07.9281077Z * [new branch] gh/PaliC/25/orig -> origin/gh/PaliC/25/orig 2025-12-04T12:53:07.9281247Z * [new branch] gh/PaliC/26/head -> origin/gh/PaliC/26/head 2025-12-04T12:53:07.9281422Z * [new branch] gh/PaliC/26/next -> origin/gh/PaliC/26/next 2025-12-04T12:53:07.9281596Z * [new branch] gh/PaliC/26/orig -> origin/gh/PaliC/26/orig 2025-12-04T12:53:07.9281768Z * [new branch] gh/PaliC/27/next -> origin/gh/PaliC/27/next 2025-12-04T12:53:07.9281938Z * [new branch] gh/PaliC/28/head -> origin/gh/PaliC/28/head 2025-12-04T12:53:07.9282110Z * [new branch] gh/PaliC/28/next -> origin/gh/PaliC/28/next 2025-12-04T12:53:07.9282284Z * [new branch] gh/PaliC/28/orig -> origin/gh/PaliC/28/orig 2025-12-04T12:53:07.9282460Z * [new branch] gh/PaliC/29/head -> origin/gh/PaliC/29/head 2025-12-04T12:53:07.9282632Z * [new branch] gh/PaliC/29/next -> origin/gh/PaliC/29/next 2025-12-04T12:53:07.9282804Z * [new branch] gh/PaliC/29/orig -> origin/gh/PaliC/29/orig 2025-12-04T12:53:07.9283014Z * [new branch] gh/PaliC/30/head -> origin/gh/PaliC/30/head 2025-12-04T12:53:07.9283189Z * [new branch] gh/PaliC/30/next -> origin/gh/PaliC/30/next 2025-12-04T12:53:07.9283359Z * [new branch] gh/PaliC/30/orig -> origin/gh/PaliC/30/orig 2025-12-04T12:53:07.9283534Z * [new branch] gh/PaliC/31/head -> origin/gh/PaliC/31/head 2025-12-04T12:53:07.9283707Z * [new branch] gh/PaliC/31/next -> origin/gh/PaliC/31/next 2025-12-04T12:53:07.9283879Z * [new branch] gh/PaliC/31/orig -> origin/gh/PaliC/31/orig 2025-12-04T12:53:07.9284063Z * [new branch] gh/PaulZhang12/25/base -> origin/gh/PaulZhang12/25/base 2025-12-04T12:53:07.9284256Z * [new branch] gh/PaulZhang12/25/head -> origin/gh/PaulZhang12/25/head 2025-12-04T12:53:07.9284443Z * [new 
branch] gh/PaulZhang12/25/orig -> origin/gh/PaulZhang12/25/orig 2025-12-04T12:53:07.9284637Z * [new branch] gh/PaulZhang12/28/base -> origin/gh/PaulZhang12/28/base 2025-12-04T12:53:07.9284832Z * [new branch] gh/PaulZhang12/28/head -> origin/gh/PaulZhang12/28/head 2025-12-04T12:53:07.9285019Z * [new branch] gh/PaulZhang12/28/orig -> origin/gh/PaulZhang12/28/orig 2025-12-04T12:53:07.9285208Z * [new branch] gh/PaulZhang12/31/base -> origin/gh/PaulZhang12/31/base 2025-12-04T12:53:07.9285398Z * [new branch] gh/PaulZhang12/31/head -> origin/gh/PaulZhang12/31/head 2025-12-04T12:53:07.9285585Z * [new branch] gh/PaulZhang12/31/orig -> origin/gh/PaulZhang12/31/orig 2025-12-04T12:53:07.9285774Z * [new branch] gh/PaulZhang12/37/base -> origin/gh/PaulZhang12/37/base 2025-12-04T12:53:07.9285970Z * [new branch] gh/PaulZhang12/37/head -> origin/gh/PaulZhang12/37/head 2025-12-04T12:53:07.9286156Z * [new branch] gh/PaulZhang12/37/orig -> origin/gh/PaulZhang12/37/orig 2025-12-04T12:53:07.9286345Z * [new branch] gh/PaulZhang12/40/base -> origin/gh/PaulZhang12/40/base 2025-12-04T12:53:07.9286534Z * [new branch] gh/PaulZhang12/40/head -> origin/gh/PaulZhang12/40/head 2025-12-04T12:53:07.9286719Z * [new branch] gh/PaulZhang12/40/orig -> origin/gh/PaulZhang12/40/orig 2025-12-04T12:53:07.9286913Z * [new branch] gh/PaulZhang12/42/base -> origin/gh/PaulZhang12/42/base 2025-12-04T12:53:07.9287100Z * [new branch] gh/PaulZhang12/42/head -> origin/gh/PaulZhang12/42/head 2025-12-04T12:53:07.9287313Z * [new branch] gh/PaulZhang12/43/base -> origin/gh/PaulZhang12/43/base 2025-12-04T12:53:07.9287501Z * [new branch] gh/PaulZhang12/43/head -> origin/gh/PaulZhang12/43/head 2025-12-04T12:53:07.9287690Z * [new branch] gh/PaulZhang12/43/orig -> origin/gh/PaulZhang12/43/orig 2025-12-04T12:53:07.9287875Z * [new branch] gh/PaulZhang12/44/base -> origin/gh/PaulZhang12/44/base 2025-12-04T12:53:07.9288079Z * [new branch] gh/PaulZhang12/44/head -> origin/gh/PaulZhang12/44/head 2025-12-04T12:53:07.9288270Z * [new branch] gh/PaulZhang12/45/base -> origin/gh/PaulZhang12/45/base 2025-12-04T12:53:07.9288458Z * [new branch] gh/PaulZhang12/45/head -> origin/gh/PaulZhang12/45/head 2025-12-04T12:53:07.9288648Z * [new branch] gh/PaulZhang12/45/orig -> origin/gh/PaulZhang12/45/orig 2025-12-04T12:53:07.9288838Z * [new branch] gh/PaulZhang12/46/base -> origin/gh/PaulZhang12/46/base 2025-12-04T12:53:07.9289033Z * [new branch] gh/PaulZhang12/46/head -> origin/gh/PaulZhang12/46/head 2025-12-04T12:53:07.9289223Z * [new branch] gh/PaulZhang12/46/orig -> origin/gh/PaulZhang12/46/orig 2025-12-04T12:53:07.9289415Z * [new branch] gh/PaulZhang12/47/base -> origin/gh/PaulZhang12/47/base 2025-12-04T12:53:07.9289638Z * [new branch] gh/PaulZhang12/47/head -> origin/gh/PaulZhang12/47/head 2025-12-04T12:53:07.9289972Z * [new branch] gh/PaulZhang12/47/orig -> origin/gh/PaulZhang12/47/orig 2025-12-04T12:53:07.9290166Z * [new branch] gh/PaulZhang12/48/base -> origin/gh/PaulZhang12/48/base 2025-12-04T12:53:07.9290390Z * [new branch] gh/PaulZhang12/48/head -> origin/gh/PaulZhang12/48/head 2025-12-04T12:53:07.9290578Z * [new branch] gh/PaulZhang12/48/orig -> origin/gh/PaulZhang12/48/orig 2025-12-04T12:53:07.9290764Z * [new branch] gh/SamGinzburg/11/base -> origin/gh/SamGinzburg/11/base 2025-12-04T12:53:07.9290960Z * [new branch] gh/SamGinzburg/11/head -> origin/gh/SamGinzburg/11/head 2025-12-04T12:53:07.9291156Z * [new branch] gh/SherlockNoMad/1/base -> origin/gh/SherlockNoMad/1/base 2025-12-04T12:53:07.9291352Z * [new branch] gh/SherlockNoMad/1/head -> 
origin/gh/SherlockNoMad/1/head 2025-12-04T12:53:07.9291557Z * [new branch] gh/SherlockNoMad/10/base -> origin/gh/SherlockNoMad/10/base 2025-12-04T12:53:07.9291883Z * [new branch] gh/SherlockNoMad/10/head -> origin/gh/SherlockNoMad/10/head 2025-12-04T12:53:07.9292239Z * [new branch] gh/SherlockNoMad/10/orig -> origin/gh/SherlockNoMad/10/orig 2025-12-04T12:53:07.9292483Z * [new branch] gh/SherlockNoMad/11/base -> origin/gh/SherlockNoMad/11/base 2025-12-04T12:53:07.9292717Z * [new branch] gh/SherlockNoMad/11/head -> origin/gh/SherlockNoMad/11/head 2025-12-04T12:53:07.9293088Z * [new branch] gh/SherlockNoMad/11/orig -> origin/gh/SherlockNoMad/11/orig 2025-12-04T12:53:07.9293332Z * [new branch] gh/SherlockNoMad/12/base -> origin/gh/SherlockNoMad/12/base 2025-12-04T12:53:07.9293561Z * [new branch] gh/SherlockNoMad/12/head -> origin/gh/SherlockNoMad/12/head 2025-12-04T12:53:07.9293834Z * [new branch] gh/SherlockNoMad/12/orig -> origin/gh/SherlockNoMad/12/orig 2025-12-04T12:53:07.9294101Z * [new branch] gh/SherlockNoMad/15/base -> origin/gh/SherlockNoMad/15/base 2025-12-04T12:53:07.9294325Z * [new branch] gh/SherlockNoMad/15/head -> origin/gh/SherlockNoMad/15/head 2025-12-04T12:53:07.9294598Z * [new branch] gh/SherlockNoMad/15/orig -> origin/gh/SherlockNoMad/15/orig 2025-12-04T12:53:07.9294836Z * [new branch] gh/SherlockNoMad/17/base -> origin/gh/SherlockNoMad/17/base 2025-12-04T12:53:07.9295104Z * [new branch] gh/SherlockNoMad/17/head -> origin/gh/SherlockNoMad/17/head 2025-12-04T12:53:07.9295393Z * [new branch] gh/SherlockNoMad/17/orig -> origin/gh/SherlockNoMad/17/orig 2025-12-04T12:53:07.9295633Z * [new branch] gh/SherlockNoMad/18/base -> origin/gh/SherlockNoMad/18/base 2025-12-04T12:53:07.9295875Z * [new branch] gh/SherlockNoMad/18/head -> origin/gh/SherlockNoMad/18/head 2025-12-04T12:53:07.9296142Z * [new branch] gh/SherlockNoMad/18/orig -> origin/gh/SherlockNoMad/18/orig 2025-12-04T12:53:07.9296378Z * [new branch] gh/SherlockNoMad/19/base -> origin/gh/SherlockNoMad/19/base 2025-12-04T12:53:07.9296636Z * [new branch] gh/SherlockNoMad/19/head -> origin/gh/SherlockNoMad/19/head 2025-12-04T12:53:07.9296872Z * [new branch] gh/SherlockNoMad/19/orig -> origin/gh/SherlockNoMad/19/orig 2025-12-04T12:53:07.9297173Z * [new branch] gh/SherlockNoMad/2/base -> origin/gh/SherlockNoMad/2/base 2025-12-04T12:53:07.9297397Z * [new branch] gh/SherlockNoMad/2/head -> origin/gh/SherlockNoMad/2/head 2025-12-04T12:53:07.9297638Z * [new branch] gh/SherlockNoMad/20/base -> origin/gh/SherlockNoMad/20/base 2025-12-04T12:53:07.9297885Z * [new branch] gh/SherlockNoMad/20/head -> origin/gh/SherlockNoMad/20/head 2025-12-04T12:53:07.9325735Z * [new branch] gh/SherlockNoMad/20/orig -> origin/gh/SherlockNoMad/20/orig 2025-12-04T12:53:07.9326139Z * [new branch] gh/SherlockNoMad/21/base -> origin/gh/SherlockNoMad/21/base 2025-12-04T12:53:07.9326378Z * [new branch] gh/SherlockNoMad/21/head -> origin/gh/SherlockNoMad/21/head 2025-12-04T12:53:07.9326593Z * [new branch] gh/SherlockNoMad/21/orig -> origin/gh/SherlockNoMad/21/orig 2025-12-04T12:53:07.9326822Z * [new branch] gh/SherlockNoMad/3/base -> origin/gh/SherlockNoMad/3/base 2025-12-04T12:53:07.9327041Z * [new branch] gh/SherlockNoMad/3/head -> origin/gh/SherlockNoMad/3/head 2025-12-04T12:53:07.9327252Z * [new branch] gh/SherlockNoMad/4/base -> origin/gh/SherlockNoMad/4/base 2025-12-04T12:53:07.9327465Z * [new branch] gh/SherlockNoMad/4/head -> origin/gh/SherlockNoMad/4/head 2025-12-04T12:53:07.9327673Z * [new branch] gh/SherlockNoMad/5/base -> origin/gh/SherlockNoMad/5/base 
2025-12-04T12:53:07.9327874Z * [new branch] gh/SherlockNoMad/5/head -> origin/gh/SherlockNoMad/5/head 2025-12-04T12:53:07.9328094Z * [new branch] gh/Sidharth123-cpu/24/base -> origin/gh/Sidharth123-cpu/24/base 2025-12-04T12:53:07.9328313Z * [new branch] gh/Sidharth123-cpu/25/base -> origin/gh/Sidharth123-cpu/25/base 2025-12-04T12:53:07.9328530Z * [new branch] gh/Sidharth123-cpu/26/base -> origin/gh/Sidharth123-cpu/26/base 2025-12-04T12:53:07.9328743Z * [new branch] gh/Sidharth123-cpu/27/base -> origin/gh/Sidharth123-cpu/27/base 2025-12-04T12:53:07.9328953Z * [new branch] gh/StrongerXi/1/base -> origin/gh/StrongerXi/1/base 2025-12-04T12:53:07.9329153Z * [new branch] gh/StrongerXi/1/head -> origin/gh/StrongerXi/1/head 2025-12-04T12:53:07.9329356Z * [new branch] gh/StrongerXi/71/base -> origin/gh/StrongerXi/71/base 2025-12-04T12:53:07.9329561Z * [new branch] gh/StrongerXi/71/head -> origin/gh/StrongerXi/71/head 2025-12-04T12:53:07.9329787Z * [new branch] gh/StrongerXi/72/base -> origin/gh/StrongerXi/72/base 2025-12-04T12:53:07.9329982Z * [new branch] gh/StrongerXi/72/head -> origin/gh/StrongerXi/72/head 2025-12-04T12:53:07.9330230Z * [new branch] gh/StrongerXi/73/base -> origin/gh/StrongerXi/73/base 2025-12-04T12:53:07.9330428Z * [new branch] gh/StrongerXi/73/head -> origin/gh/StrongerXi/73/head 2025-12-04T12:53:07.9330627Z * [new branch] gh/StrongerXi/73/orig -> origin/gh/StrongerXi/73/orig 2025-12-04T12:53:07.9330816Z * [new branch] gh/XilunWu/160/base -> origin/gh/XilunWu/160/base 2025-12-04T12:53:07.9331043Z * [new branch] gh/XilunWu/160/head -> origin/gh/XilunWu/160/head 2025-12-04T12:53:07.9331234Z * [new branch] gh/XilunWu/160/orig -> origin/gh/XilunWu/160/orig 2025-12-04T12:53:07.9331419Z * [new branch] gh/XilunWu/163/base -> origin/gh/XilunWu/163/base 2025-12-04T12:53:07.9331607Z * [new branch] gh/XilunWu/163/head -> origin/gh/XilunWu/163/head 2025-12-04T12:53:07.9331807Z * [new branch] gh/XilunWu/163/orig -> origin/gh/XilunWu/163/orig 2025-12-04T12:53:07.9331990Z * [new branch] gh/XilunWu/168/base -> origin/gh/XilunWu/168/base 2025-12-04T12:53:07.9332179Z * [new branch] gh/XilunWu/168/head -> origin/gh/XilunWu/168/head 2025-12-04T12:53:07.9332363Z * [new branch] gh/XilunWu/168/orig -> origin/gh/XilunWu/168/orig 2025-12-04T12:53:07.9332552Z * [new branch] gh/XilunWu/169/base -> origin/gh/XilunWu/169/base 2025-12-04T12:53:07.9332733Z * [new branch] gh/XilunWu/169/head -> origin/gh/XilunWu/169/head 2025-12-04T12:53:07.9332927Z * [new branch] gh/XilunWu/169/orig -> origin/gh/XilunWu/169/orig 2025-12-04T12:53:07.9333109Z * [new branch] gh/XilunWu/170/base -> origin/gh/XilunWu/170/base 2025-12-04T12:53:07.9333359Z * [new branch] gh/XilunWu/170/head -> origin/gh/XilunWu/170/head 2025-12-04T12:53:07.9333565Z * [new branch] gh/XilunWu/170/orig -> origin/gh/XilunWu/170/orig 2025-12-04T12:53:07.9333759Z * [new branch] gh/XilunWu/171/base -> origin/gh/XilunWu/171/base 2025-12-04T12:53:07.9333973Z * [new branch] gh/XilunWu/171/head -> origin/gh/XilunWu/171/head 2025-12-04T12:53:07.9334166Z * [new branch] gh/XilunWu/171/orig -> origin/gh/XilunWu/171/orig 2025-12-04T12:53:07.9334358Z * [new branch] gh/XilunWu/173/base -> origin/gh/XilunWu/173/base 2025-12-04T12:53:07.9334558Z * [new branch] gh/XilunWu/173/head -> origin/gh/XilunWu/173/head 2025-12-04T12:53:07.9334742Z * [new branch] gh/XilunWu/173/orig -> origin/gh/XilunWu/173/orig 2025-12-04T12:53:07.9334921Z * [new branch] gh/XilunWu/175/base -> origin/gh/XilunWu/175/base 2025-12-04T12:53:07.9335105Z * [new branch] gh/XilunWu/175/head -> 
origin/gh/XilunWu/175/head 2025-12-04T12:53:07.9335285Z * [new branch] gh/XilunWu/175/orig -> origin/gh/XilunWu/175/orig 2025-12-04T12:53:07.9335477Z * [new branch] gh/XilunWu/176/base -> origin/gh/XilunWu/176/base 2025-12-04T12:53:07.9335660Z * [new branch] gh/XilunWu/176/head -> origin/gh/XilunWu/176/head 2025-12-04T12:53:07.9335839Z * [new branch] gh/XilunWu/176/orig -> origin/gh/XilunWu/176/orig 2025-12-04T12:53:07.9336025Z * [new branch] gh/XuehaiPan/14/base -> origin/gh/XuehaiPan/14/base 2025-12-04T12:53:07.9336224Z * [new branch] gh/XuehaiPan/14/head -> origin/gh/XuehaiPan/14/head 2025-12-04T12:53:07.9336406Z * [new branch] gh/XuehaiPan/14/orig -> origin/gh/XuehaiPan/14/orig 2025-12-04T12:53:07.9336594Z * [new branch] gh/XuehaiPan/179/base -> origin/gh/XuehaiPan/179/base 2025-12-04T12:53:07.9336789Z * [new branch] gh/XuehaiPan/179/head -> origin/gh/XuehaiPan/179/head 2025-12-04T12:53:07.9336981Z * [new branch] gh/XuehaiPan/179/orig -> origin/gh/XuehaiPan/179/orig 2025-12-04T12:53:07.9337175Z * [new branch] gh/XuehaiPan/249/base -> origin/gh/XuehaiPan/249/base 2025-12-04T12:53:07.9337363Z * [new branch] gh/XuehaiPan/249/head -> origin/gh/XuehaiPan/249/head 2025-12-04T12:53:07.9337549Z * [new branch] gh/XuehaiPan/249/orig -> origin/gh/XuehaiPan/249/orig 2025-12-04T12:53:07.9337768Z * [new branch] gh/XuehaiPan/253/base -> origin/gh/XuehaiPan/253/base 2025-12-04T12:53:07.9337957Z * [new branch] gh/XuehaiPan/253/head -> origin/gh/XuehaiPan/253/head 2025-12-04T12:53:07.9338142Z * [new branch] gh/XuehaiPan/253/orig -> origin/gh/XuehaiPan/253/orig 2025-12-04T12:53:07.9338331Z * [new branch] gh/XuehaiPan/254/base -> origin/gh/XuehaiPan/254/base 2025-12-04T12:53:07.9338522Z * [new branch] gh/XuehaiPan/254/head -> origin/gh/XuehaiPan/254/head 2025-12-04T12:53:07.9338712Z * [new branch] gh/XuehaiPan/254/orig -> origin/gh/XuehaiPan/254/orig 2025-12-04T12:53:07.9338898Z * [new branch] gh/XuehaiPan/255/base -> origin/gh/XuehaiPan/255/base 2025-12-04T12:53:07.9339084Z * [new branch] gh/XuehaiPan/255/head -> origin/gh/XuehaiPan/255/head 2025-12-04T12:53:07.9339272Z * [new branch] gh/XuehaiPan/255/orig -> origin/gh/XuehaiPan/255/orig 2025-12-04T12:53:07.9339459Z * [new branch] gh/XuehaiPan/271/base -> origin/gh/XuehaiPan/271/base 2025-12-04T12:53:07.9339645Z * [new branch] gh/XuehaiPan/271/head -> origin/gh/XuehaiPan/271/head 2025-12-04T12:53:07.9339836Z * [new branch] gh/XuehaiPan/271/orig -> origin/gh/XuehaiPan/271/orig 2025-12-04T12:53:07.9340057Z * [new branch] gh/XuehaiPan/343/base -> origin/gh/XuehaiPan/343/base 2025-12-04T12:53:07.9340288Z * [new branch] gh/XuehaiPan/343/head -> origin/gh/XuehaiPan/343/head 2025-12-04T12:53:07.9340478Z * [new branch] gh/XuehaiPan/343/orig -> origin/gh/XuehaiPan/343/orig 2025-12-04T12:53:07.9340666Z * [new branch] gh/XuehaiPan/347/base -> origin/gh/XuehaiPan/347/base 2025-12-04T12:53:07.9340852Z * [new branch] gh/XuehaiPan/347/head -> origin/gh/XuehaiPan/347/head 2025-12-04T12:53:07.9341041Z * [new branch] gh/XuehaiPan/347/orig -> origin/gh/XuehaiPan/347/orig 2025-12-04T12:53:07.9341233Z * [new branch] gh/XuehaiPan/348/base -> origin/gh/XuehaiPan/348/base 2025-12-04T12:53:07.9341417Z * [new branch] gh/XuehaiPan/348/head -> origin/gh/XuehaiPan/348/head 2025-12-04T12:53:07.9341604Z * [new branch] gh/XuehaiPan/348/orig -> origin/gh/XuehaiPan/348/orig 2025-12-04T12:53:07.9341795Z * [new branch] gh/XuehaiPan/350/base -> origin/gh/XuehaiPan/350/base 2025-12-04T12:53:07.9341980Z * [new branch] gh/XuehaiPan/350/head -> origin/gh/XuehaiPan/350/head 
2025-12-04T12:53:07.9342169Z * [new branch] gh/XuehaiPan/350/orig -> origin/gh/XuehaiPan/350/orig 2025-12-04T12:53:07.9342356Z * [new branch] gh/XuehaiPan/365/base -> origin/gh/XuehaiPan/365/base 2025-12-04T12:53:07.9342541Z * [new branch] gh/XuehaiPan/365/head -> origin/gh/XuehaiPan/365/head 2025-12-04T12:53:07.9342730Z * [new branch] gh/XuehaiPan/365/orig -> origin/gh/XuehaiPan/365/orig 2025-12-04T12:53:07.9342918Z * [new branch] gh/XuehaiPan/366/base -> origin/gh/XuehaiPan/366/base 2025-12-04T12:53:07.9343109Z * [new branch] gh/XuehaiPan/366/head -> origin/gh/XuehaiPan/366/head 2025-12-04T12:53:07.9343307Z * [new branch] gh/XuehaiPan/370/base -> origin/gh/XuehaiPan/370/base 2025-12-04T12:53:07.9343499Z * [new branch] gh/XuehaiPan/370/head -> origin/gh/XuehaiPan/370/head 2025-12-04T12:53:07.9343692Z * [new branch] gh/XuehaiPan/370/orig -> origin/gh/XuehaiPan/370/orig 2025-12-04T12:53:07.9343890Z * [new branch] gh/XuehaiPan/390/base -> origin/gh/XuehaiPan/390/base 2025-12-04T12:53:07.9344078Z * [new branch] gh/XuehaiPan/390/head -> origin/gh/XuehaiPan/390/head 2025-12-04T12:53:07.9344271Z * [new branch] gh/XuehaiPan/390/orig -> origin/gh/XuehaiPan/390/orig 2025-12-04T12:53:07.9344462Z * [new branch] gh/XuehaiPan/391/base -> origin/gh/XuehaiPan/391/base 2025-12-04T12:53:07.9344690Z * [new branch] gh/XuehaiPan/391/head -> origin/gh/XuehaiPan/391/head 2025-12-04T12:53:07.9344881Z * [new branch] gh/XuehaiPan/391/orig -> origin/gh/XuehaiPan/391/orig 2025-12-04T12:53:07.9345074Z * [new branch] gh/XuehaiPan/392/base -> origin/gh/XuehaiPan/392/base 2025-12-04T12:53:07.9345262Z * [new branch] gh/XuehaiPan/392/head -> origin/gh/XuehaiPan/392/head 2025-12-04T12:53:07.9345447Z * [new branch] gh/XuehaiPan/392/orig -> origin/gh/XuehaiPan/392/orig 2025-12-04T12:53:07.9345640Z * [new branch] gh/XuehaiPan/394/base -> origin/gh/XuehaiPan/394/base 2025-12-04T12:53:07.9345828Z * [new branch] gh/XuehaiPan/394/head -> origin/gh/XuehaiPan/394/head 2025-12-04T12:53:07.9346015Z * [new branch] gh/XuehaiPan/394/orig -> origin/gh/XuehaiPan/394/orig 2025-12-04T12:53:07.9346204Z * [new branch] gh/XuehaiPan/397/base -> origin/gh/XuehaiPan/397/base 2025-12-04T12:53:07.9346391Z * [new branch] gh/XuehaiPan/397/head -> origin/gh/XuehaiPan/397/head 2025-12-04T12:53:07.9346579Z * [new branch] gh/XuehaiPan/397/orig -> origin/gh/XuehaiPan/397/orig 2025-12-04T12:53:07.9346771Z * [new branch] gh/XuehaiPan/398/base -> origin/gh/XuehaiPan/398/base 2025-12-04T12:53:07.9346992Z * [new branch] gh/XuehaiPan/398/head -> origin/gh/XuehaiPan/398/head 2025-12-04T12:53:07.9347179Z * [new branch] gh/XuehaiPan/398/orig -> origin/gh/XuehaiPan/398/orig 2025-12-04T12:53:07.9347366Z * [new branch] gh/XuehaiPan/399/base -> origin/gh/XuehaiPan/399/base 2025-12-04T12:53:07.9347552Z * [new branch] gh/XuehaiPan/399/head -> origin/gh/XuehaiPan/399/head 2025-12-04T12:53:07.9347744Z * [new branch] gh/XuehaiPan/399/orig -> origin/gh/XuehaiPan/399/orig 2025-12-04T12:53:07.9347928Z * [new branch] gh/XuehaiPan/400/base -> origin/gh/XuehaiPan/400/base 2025-12-04T12:53:07.9348120Z * [new branch] gh/XuehaiPan/400/head -> origin/gh/XuehaiPan/400/head 2025-12-04T12:53:07.9348305Z * [new branch] gh/XuehaiPan/400/orig -> origin/gh/XuehaiPan/400/orig 2025-12-04T12:53:07.9348499Z * [new branch] gh/ZhiweiYan-96/39/base -> origin/gh/ZhiweiYan-96/39/base 2025-12-04T12:53:07.9348696Z * [new branch] gh/ZhiweiYan-96/39/head -> origin/gh/ZhiweiYan-96/39/head 2025-12-04T12:53:07.9348890Z * [new branch] gh/ZhiweiYan-96/39/orig -> origin/gh/ZhiweiYan-96/39/orig 
2025-12-04T12:53:07.9349079Z * [new branch] gh/ZhiweiYan-96/44/base -> origin/gh/ZhiweiYan-96/44/base 2025-12-04T12:53:07.9349271Z * [new branch] gh/ZhiweiYan-96/44/head -> origin/gh/ZhiweiYan-96/44/head 2025-12-04T12:53:07.9349461Z * [new branch] gh/ZhiweiYan-96/45/base -> origin/gh/ZhiweiYan-96/45/base 2025-12-04T12:53:07.9349655Z * [new branch] gh/ZhiweiYan-96/45/head -> origin/gh/ZhiweiYan-96/45/head 2025-12-04T12:53:07.9349848Z * [new branch] gh/ZhiweiYan-96/49/base -> origin/gh/ZhiweiYan-96/49/base 2025-12-04T12:53:07.9350037Z * [new branch] gh/ZhiweiYan-96/49/head -> origin/gh/ZhiweiYan-96/49/head 2025-12-04T12:53:07.9350264Z * [new branch] gh/ZhiweiYan-96/62/base -> origin/gh/ZhiweiYan-96/62/base 2025-12-04T12:53:07.9350461Z * [new branch] gh/ZhiweiYan-96/62/head -> origin/gh/ZhiweiYan-96/62/head 2025-12-04T12:53:07.9350650Z * [new branch] gh/ZhiweiYan-96/66/base -> origin/gh/ZhiweiYan-96/66/base 2025-12-04T12:53:07.9350838Z * [new branch] gh/ZhiweiYan-96/66/head -> origin/gh/ZhiweiYan-96/66/head 2025-12-04T12:53:07.9351026Z * [new branch] gh/ZhiweiYan-96/67/base -> origin/gh/ZhiweiYan-96/67/base 2025-12-04T12:53:07.9351215Z * [new branch] gh/ZhiweiYan-96/67/head -> origin/gh/ZhiweiYan-96/67/head 2025-12-04T12:53:07.9351440Z * [new branch] gh/ZhiweiYan-96/68/base -> origin/gh/ZhiweiYan-96/68/base 2025-12-04T12:53:07.9351635Z * [new branch] gh/ZhiweiYan-96/68/head -> origin/gh/ZhiweiYan-96/68/head 2025-12-04T12:53:07.9351823Z * [new branch] gh/ZhiweiYan-96/68/orig -> origin/gh/ZhiweiYan-96/68/orig 2025-12-04T12:53:07.9352010Z * [new branch] gh/aakhundov/1/base -> origin/gh/aakhundov/1/base 2025-12-04T12:53:07.9352193Z * [new branch] gh/aakhundov/1/head -> origin/gh/aakhundov/1/head 2025-12-04T12:53:07.9352376Z * [new branch] gh/aakhundov/2/base -> origin/gh/aakhundov/2/base 2025-12-04T12:53:07.9352559Z * [new branch] gh/aakhundov/2/head -> origin/gh/aakhundov/2/head 2025-12-04T12:53:07.9352749Z * [new branch] gh/aditew01/openblas -> origin/gh/aditew01/openblas 2025-12-04T12:53:07.9352936Z * [new branch] gh/aditew01/sbgemm -> origin/gh/aditew01/sbgemm 2025-12-04T12:53:07.9353131Z * [new branch] gh/aditew01/vecbf16 -> origin/gh/aditew01/vecbf16 2025-12-04T12:53:07.9353316Z * [new branch] gh/albanD/4/base -> origin/gh/albanD/4/base 2025-12-04T12:53:07.9353485Z * [new branch] gh/albanD/4/head -> origin/gh/albanD/4/head 2025-12-04T12:53:07.9353689Z * [new branch] gh/albanD/4/orig -> origin/gh/albanD/4/orig 2025-12-04T12:53:07.9353956Z * [new branch] gh/alexbrauckmann/paddedtensor_faketensor_init -> origin/gh/alexbrauckmann/paddedtensor_faketensor_init 2025-12-04T12:53:07.9354222Z * [new branch] gh/alexsamardzic/12/base -> origin/gh/alexsamardzic/12/base 2025-12-04T12:53:07.9354431Z * [new branch] gh/alexsamardzic/12/head -> origin/gh/alexsamardzic/12/head 2025-12-04T12:53:07.9354633Z * [new branch] gh/alexsamardzic/12/orig -> origin/gh/alexsamardzic/12/orig 2025-12-04T12:53:07.9354831Z * [new branch] gh/alexsamardzic/14/base -> origin/gh/alexsamardzic/14/base 2025-12-04T12:53:07.9355031Z * [new branch] gh/alexsamardzic/14/head -> origin/gh/alexsamardzic/14/head 2025-12-04T12:53:07.9355234Z * [new branch] gh/alexsamardzic/14/orig -> origin/gh/alexsamardzic/14/orig 2025-12-04T12:53:07.9355432Z * [new branch] gh/alexsamardzic/15/base -> origin/gh/alexsamardzic/15/base 2025-12-04T12:53:07.9355629Z * [new branch] gh/alexsamardzic/15/head -> origin/gh/alexsamardzic/15/head 2025-12-04T12:53:07.9355830Z * [new branch] gh/alexsamardzic/15/orig -> origin/gh/alexsamardzic/15/orig 
2025-12-04T12:53:07.9356017Z * [new branch] gh/amjames/18/base -> origin/gh/amjames/18/base 2025-12-04T12:53:07.9356197Z * [new branch] gh/amjames/18/head -> origin/gh/amjames/18/head 2025-12-04T12:53:07.9356382Z * [new branch] gh/amjames/18/orig -> origin/gh/amjames/18/orig 2025-12-04T12:53:07.9356564Z * [new branch] gh/andrewor14/35/base -> origin/gh/andrewor14/35/base 2025-12-04T12:53:07.9356755Z * [new branch] gh/andrewor14/35/head -> origin/gh/andrewor14/35/head 2025-12-04T12:53:07.9356947Z * [new branch] gh/andrewor14/35/orig -> origin/gh/andrewor14/35/orig 2025-12-04T12:53:07.9357138Z * [new branch] gh/andrewor14/50/base -> origin/gh/andrewor14/50/base 2025-12-04T12:53:07.9357324Z * [new branch] gh/andrewor14/50/head -> origin/gh/andrewor14/50/head 2025-12-04T12:53:07.9357510Z * [new branch] gh/andrewor14/50/orig -> origin/gh/andrewor14/50/orig 2025-12-04T12:53:07.9357704Z * [new branch] gh/andyanwang/30/base -> origin/gh/andyanwang/30/base 2025-12-04T12:53:07.9357895Z * [new branch] gh/andyanwang/30/orig -> origin/gh/andyanwang/30/orig 2025-12-04T12:53:07.9358079Z * [new branch] gh/andyanwang/31/base -> origin/gh/andyanwang/31/base 2025-12-04T12:53:07.9358293Z * [new branch] gh/andyanwang/31/orig -> origin/gh/andyanwang/31/orig 2025-12-04T12:53:07.9358490Z * [new branch] gh/andyanwang/39/base -> origin/gh/andyanwang/39/base 2025-12-04T12:53:07.9358676Z * [new branch] gh/andyanwang/39/head -> origin/gh/andyanwang/39/head 2025-12-04T12:53:07.9358870Z * [new branch] gh/andyanwang/39/orig -> origin/gh/andyanwang/39/orig 2025-12-04T12:53:07.9359063Z * [new branch] gh/andyanwang/42/base -> origin/gh/andyanwang/42/base 2025-12-04T12:53:07.9359253Z * [new branch] gh/andyanwang/42/head -> origin/gh/andyanwang/42/head 2025-12-04T12:53:07.9359441Z * [new branch] gh/andyanwang/42/orig -> origin/gh/andyanwang/42/orig 2025-12-04T12:53:07.9359630Z * [new branch] gh/andyanwang/45/base -> origin/gh/andyanwang/45/base 2025-12-04T12:53:07.9359827Z * [new branch] gh/andyanwang/45/head -> origin/gh/andyanwang/45/head 2025-12-04T12:53:07.9360014Z * [new branch] gh/andyanwang/45/orig -> origin/gh/andyanwang/45/orig 2025-12-04T12:53:07.9360245Z * [new branch] gh/angelayi/107/base -> origin/gh/angelayi/107/base 2025-12-04T12:53:07.9360432Z * [new branch] gh/angelayi/107/head -> origin/gh/angelayi/107/head 2025-12-04T12:53:07.9360664Z * [new branch] gh/angelayi/114/base -> origin/gh/angelayi/114/base 2025-12-04T12:53:07.9360849Z * [new branch] gh/angelayi/114/head -> origin/gh/angelayi/114/head 2025-12-04T12:53:07.9361030Z * [new branch] gh/angelayi/114/orig -> origin/gh/angelayi/114/orig 2025-12-04T12:53:07.9361208Z * [new branch] gh/angelayi/116/base -> origin/gh/angelayi/116/base 2025-12-04T12:53:07.9361395Z * [new branch] gh/angelayi/116/head -> origin/gh/angelayi/116/head 2025-12-04T12:53:07.9361576Z * [new branch] gh/angelayi/116/orig -> origin/gh/angelayi/116/orig 2025-12-04T12:53:07.9361755Z * [new branch] gh/angelayi/122/base -> origin/gh/angelayi/122/base 2025-12-04T12:53:07.9361937Z * [new branch] gh/angelayi/122/head -> origin/gh/angelayi/122/head 2025-12-04T12:53:07.9362115Z * [new branch] gh/angelayi/122/orig -> origin/gh/angelayi/122/orig 2025-12-04T12:53:07.9362296Z * [new branch] gh/angelayi/124/base -> origin/gh/angelayi/124/base 2025-12-04T12:53:07.9362484Z * [new branch] gh/angelayi/124/head -> origin/gh/angelayi/124/head 2025-12-04T12:53:07.9362664Z * [new branch] gh/angelayi/124/orig -> origin/gh/angelayi/124/orig 2025-12-04T12:53:07.9362839Z * [new branch] gh/angelayi/128/base -> 
origin/gh/angelayi/128/base 2025-12-04T12:53:07.9363020Z * [new branch] gh/angelayi/128/head -> origin/gh/angelayi/128/head 2025-12-04T12:53:07.9363202Z * [new branch] gh/angelayi/128/orig -> origin/gh/angelayi/128/orig 2025-12-04T12:53:07.9363388Z * [new branch] gh/angelayi/131/base -> origin/gh/angelayi/131/base 2025-12-04T12:53:07.9363568Z * [new branch] gh/angelayi/131/head -> origin/gh/angelayi/131/head 2025-12-04T12:53:07.9363749Z * [new branch] gh/angelayi/131/orig -> origin/gh/angelayi/131/orig 2025-12-04T12:53:07.9363931Z * [new branch] gh/angelayi/132/base -> origin/gh/angelayi/132/base 2025-12-04T12:53:07.9364114Z * [new branch] gh/angelayi/132/head -> origin/gh/angelayi/132/head 2025-12-04T12:53:07.9364293Z * [new branch] gh/angelayi/132/orig -> origin/gh/angelayi/132/orig 2025-12-04T12:53:07.9364473Z * [new branch] gh/angelayi/133/base -> origin/gh/angelayi/133/base 2025-12-04T12:53:07.9364656Z * [new branch] gh/angelayi/133/head -> origin/gh/angelayi/133/head 2025-12-04T12:53:07.9364868Z * [new branch] gh/angelayi/133/orig -> origin/gh/angelayi/133/orig 2025-12-04T12:53:07.9365050Z * [new branch] gh/angelayi/134/base -> origin/gh/angelayi/134/base 2025-12-04T12:53:07.9365240Z * [new branch] gh/angelayi/134/head -> origin/gh/angelayi/134/head 2025-12-04T12:53:07.9365423Z * [new branch] gh/angelayi/134/orig -> origin/gh/angelayi/134/orig 2025-12-04T12:53:07.9365602Z * [new branch] gh/angelayi/135/base -> origin/gh/angelayi/135/base 2025-12-04T12:53:07.9365788Z * [new branch] gh/angelayi/135/head -> origin/gh/angelayi/135/head 2025-12-04T12:53:07.9365974Z * [new branch] gh/angelayi/135/orig -> origin/gh/angelayi/135/orig 2025-12-04T12:53:07.9366162Z * [new branch] gh/angelayi/136/base -> origin/gh/angelayi/136/base 2025-12-04T12:53:07.9366351Z * [new branch] gh/angelayi/136/head -> origin/gh/angelayi/136/head 2025-12-04T12:53:07.9366531Z * [new branch] gh/angelayi/136/orig -> origin/gh/angelayi/136/orig 2025-12-04T12:53:07.9366712Z * [new branch] gh/angelayi/137/base -> origin/gh/angelayi/137/base 2025-12-04T12:53:07.9366891Z * [new branch] gh/angelayi/137/head -> origin/gh/angelayi/137/head 2025-12-04T12:53:07.9367091Z * [new branch] gh/angelayi/137/orig -> origin/gh/angelayi/137/orig 2025-12-04T12:53:07.9367278Z * [new branch] gh/angelayi/138/base -> origin/gh/angelayi/138/base 2025-12-04T12:53:07.9367458Z * [new branch] gh/angelayi/138/head -> origin/gh/angelayi/138/head 2025-12-04T12:53:07.9367635Z * [new branch] gh/angelayi/138/orig -> origin/gh/angelayi/138/orig 2025-12-04T12:53:07.9367815Z * [new branch] gh/angelayi/139/base -> origin/gh/angelayi/139/base 2025-12-04T12:53:07.9367993Z * [new branch] gh/angelayi/139/head -> origin/gh/angelayi/139/head 2025-12-04T12:53:07.9368178Z * [new branch] gh/angelayi/139/orig -> origin/gh/angelayi/139/orig 2025-12-04T12:53:07.9368360Z * [new branch] gh/angelayi/140/base -> origin/gh/angelayi/140/base 2025-12-04T12:53:07.9368538Z * [new branch] gh/angelayi/140/head -> origin/gh/angelayi/140/head 2025-12-04T12:53:07.9368717Z * [new branch] gh/angelayi/140/orig -> origin/gh/angelayi/140/orig 2025-12-04T12:53:07.9368897Z * [new branch] gh/angelayi/141/base -> origin/gh/angelayi/141/base 2025-12-04T12:53:07.9369085Z * [new branch] gh/angelayi/141/head -> origin/gh/angelayi/141/head 2025-12-04T12:53:07.9369262Z * [new branch] gh/angelayi/141/orig -> origin/gh/angelayi/141/orig 2025-12-04T12:53:07.9369441Z * [new branch] gh/angelayi/142/base -> origin/gh/angelayi/142/base 2025-12-04T12:53:07.9369622Z * [new branch] gh/angelayi/142/head -> 
origin/gh/angelayi/142/head 2025-12-04T12:53:07.9369803Z * [new branch] gh/angelayi/142/orig -> origin/gh/angelayi/142/orig 2025-12-04T12:53:07.9369988Z * [new branch] gh/angelayi/143/base -> origin/gh/angelayi/143/base 2025-12-04T12:53:07.9370205Z * [new branch] gh/angelayi/143/head -> origin/gh/angelayi/143/head 2025-12-04T12:53:07.9370466Z * [new branch] gh/angelayi/143/orig -> origin/gh/angelayi/143/orig 2025-12-04T12:53:07.9370651Z * [new branch] gh/angelayi/144/base -> origin/gh/angelayi/144/base 2025-12-04T12:53:07.9370831Z * [new branch] gh/angelayi/144/head -> origin/gh/angelayi/144/head 2025-12-04T12:53:07.9371009Z * [new branch] gh/angelayi/144/orig -> origin/gh/angelayi/144/orig 2025-12-04T12:53:07.9371201Z * [new branch] gh/anijain2305/753/base -> origin/gh/anijain2305/753/base 2025-12-04T12:53:07.9371391Z * [new branch] gh/anijain2305/753/head -> origin/gh/anijain2305/753/head 2025-12-04T12:53:07.9371612Z * [new branch] gh/anijain2305/753/orig -> origin/gh/anijain2305/753/orig 2025-12-04T12:53:07.9371799Z * [new branch] gh/anijain2305/810/base -> origin/gh/anijain2305/810/base 2025-12-04T12:53:07.9371981Z * [new branch] gh/anijain2305/810/head -> origin/gh/anijain2305/810/head 2025-12-04T12:53:07.9372177Z * [new branch] gh/anijain2305/810/orig -> origin/gh/anijain2305/810/orig 2025-12-04T12:53:07.9372363Z * [new branch] gh/anijain2305/854/base -> origin/gh/anijain2305/854/base 2025-12-04T12:53:07.9372547Z * [new branch] gh/anijain2305/854/head -> origin/gh/anijain2305/854/head 2025-12-04T12:53:07.9372732Z * [new branch] gh/anijain2305/854/orig -> origin/gh/anijain2305/854/orig 2025-12-04T12:53:07.9372916Z * [new branch] gh/anijain2305/864/base -> origin/gh/anijain2305/864/base 2025-12-04T12:53:07.9373108Z * [new branch] gh/anijain2305/864/head -> origin/gh/anijain2305/864/head 2025-12-04T12:53:07.9373298Z * [new branch] gh/anijain2305/864/orig -> origin/gh/anijain2305/864/orig 2025-12-04T12:53:07.9373482Z * [new branch] gh/anijain2305/870/base -> origin/gh/anijain2305/870/base 2025-12-04T12:53:07.9373667Z * [new branch] gh/anijain2305/870/head -> origin/gh/anijain2305/870/head 2025-12-04T12:53:07.9373884Z * [new branch] gh/anijain2305/870/orig -> origin/gh/anijain2305/870/orig 2025-12-04T12:53:07.9374076Z * [new branch] gh/anijain2305/873/base -> origin/gh/anijain2305/873/base 2025-12-04T12:53:07.9374260Z * [new branch] gh/anijain2305/873/head -> origin/gh/anijain2305/873/head 2025-12-04T12:53:07.9374446Z * [new branch] gh/anijain2305/873/orig -> origin/gh/anijain2305/873/orig 2025-12-04T12:53:07.9374630Z * [new branch] gh/anijain2305/894/base -> origin/gh/anijain2305/894/base 2025-12-04T12:53:07.9374820Z * [new branch] gh/anijain2305/894/head -> origin/gh/anijain2305/894/head 2025-12-04T12:53:07.9375011Z * [new branch] gh/anijain2305/894/orig -> origin/gh/anijain2305/894/orig 2025-12-04T12:53:07.9375194Z * [new branch] gh/anijain2305/895/base -> origin/gh/anijain2305/895/base 2025-12-04T12:53:07.9375385Z * [new branch] gh/anijain2305/895/head -> origin/gh/anijain2305/895/head 2025-12-04T12:53:07.9375570Z * [new branch] gh/anijain2305/895/orig -> origin/gh/anijain2305/895/orig 2025-12-04T12:53:07.9375755Z * [new branch] gh/anijain2305/910/base -> origin/gh/anijain2305/910/base 2025-12-04T12:53:07.9375946Z * [new branch] gh/anijain2305/910/head -> origin/gh/anijain2305/910/head 2025-12-04T12:53:07.9376132Z * [new branch] gh/anijain2305/910/orig -> origin/gh/anijain2305/910/orig 2025-12-04T12:53:07.9376313Z * [new branch] gh/anijain2305/919/base -> 
origin/gh/anijain2305/919/base 2025-12-04T12:53:07.9376500Z * [new branch] gh/anijain2305/919/head -> origin/gh/anijain2305/919/head 2025-12-04T12:53:07.9376689Z * [new branch] gh/anijain2305/919/orig -> origin/gh/anijain2305/919/orig 2025-12-04T12:53:07.9376872Z * [new branch] gh/anijain2305/922/base -> origin/gh/anijain2305/922/base 2025-12-04T12:53:07.9377057Z * [new branch] gh/anijain2305/922/head -> origin/gh/anijain2305/922/head 2025-12-04T12:53:07.9377251Z * [new branch] gh/anijain2305/922/orig -> origin/gh/anijain2305/922/orig 2025-12-04T12:53:07.9377434Z * [new branch] gh/anijain2305/932/base -> origin/gh/anijain2305/932/base 2025-12-04T12:53:07.9377626Z * [new branch] gh/anijain2305/932/head -> origin/gh/anijain2305/932/head 2025-12-04T12:53:07.9377815Z * [new branch] gh/anijain2305/932/orig -> origin/gh/anijain2305/932/orig 2025-12-04T12:53:07.9378028Z * [new branch] gh/anijain2305/940/base -> origin/gh/anijain2305/940/base 2025-12-04T12:53:07.9378214Z * [new branch] gh/anijain2305/940/head -> origin/gh/anijain2305/940/head 2025-12-04T12:53:07.9378398Z * [new branch] gh/anijain2305/940/orig -> origin/gh/anijain2305/940/orig 2025-12-04T12:53:07.9378581Z * [new branch] gh/anijain2305/941/base -> origin/gh/anijain2305/941/base 2025-12-04T12:53:07.9378768Z * [new branch] gh/anijain2305/941/head -> origin/gh/anijain2305/941/head 2025-12-04T12:53:07.9378958Z * [new branch] gh/anijain2305/941/orig -> origin/gh/anijain2305/941/orig 2025-12-04T12:53:07.9379145Z * [new branch] gh/anijain2305/942/base -> origin/gh/anijain2305/942/base 2025-12-04T12:53:07.9379332Z * [new branch] gh/anijain2305/942/head -> origin/gh/anijain2305/942/head 2025-12-04T12:53:07.9379519Z * [new branch] gh/anijain2305/942/orig -> origin/gh/anijain2305/942/orig 2025-12-04T12:53:07.9379707Z * [new branch] gh/anijain2305/943/base -> origin/gh/anijain2305/943/base 2025-12-04T12:53:07.9379902Z * [new branch] gh/anijain2305/943/head -> origin/gh/anijain2305/943/head 2025-12-04T12:53:07.9380085Z * [new branch] gh/anijain2305/943/orig -> origin/gh/anijain2305/943/orig 2025-12-04T12:53:07.9380350Z * [new branch] gh/anijain2305/944/base -> origin/gh/anijain2305/944/base 2025-12-04T12:53:07.9380537Z * [new branch] gh/anijain2305/944/head -> origin/gh/anijain2305/944/head 2025-12-04T12:53:07.9380720Z * [new branch] gh/anijain2305/944/orig -> origin/gh/anijain2305/944/orig 2025-12-04T12:53:07.9380915Z * [new branch] gh/anijain2305/945/base -> origin/gh/anijain2305/945/base 2025-12-04T12:53:07.9381101Z * [new branch] gh/anijain2305/945/head -> origin/gh/anijain2305/945/head 2025-12-04T12:53:07.9381285Z * [new branch] gh/anijain2305/945/orig -> origin/gh/anijain2305/945/orig 2025-12-04T12:53:07.9381478Z * [new branch] gh/anijain2305/946/base -> origin/gh/anijain2305/946/base 2025-12-04T12:53:07.9381667Z * [new branch] gh/anijain2305/946/head -> origin/gh/anijain2305/946/head 2025-12-04T12:53:07.9381851Z * [new branch] gh/anijain2305/946/orig -> origin/gh/anijain2305/946/orig 2025-12-04T12:53:07.9382051Z * [new branch] gh/anijain2305/947/base -> origin/gh/anijain2305/947/base 2025-12-04T12:53:07.9382236Z * [new branch] gh/anijain2305/947/head -> origin/gh/anijain2305/947/head 2025-12-04T12:53:07.9382419Z * [new branch] gh/anijain2305/947/orig -> origin/gh/anijain2305/947/orig 2025-12-04T12:53:07.9382607Z * [new branch] gh/anijain2305/948/base -> origin/gh/anijain2305/948/base 2025-12-04T12:53:07.9382797Z * [new branch] gh/anijain2305/948/head -> origin/gh/anijain2305/948/head 2025-12-04T12:53:07.9382988Z * [new branch] 
gh/anijain2305/948/orig -> origin/gh/anijain2305/948/orig 2025-12-04T12:53:07.9383182Z * [new branch] gh/anijain2305/949/base -> origin/gh/anijain2305/949/base 2025-12-04T12:53:07.9383365Z * [new branch] gh/anijain2305/949/head -> origin/gh/anijain2305/949/head 2025-12-04T12:53:07.9383550Z * [new branch] gh/anijain2305/949/orig -> origin/gh/anijain2305/949/orig 2025-12-04T12:53:07.9383739Z * [new branch] gh/anijain2305/950/base -> origin/gh/anijain2305/950/base 2025-12-04T12:53:07.9383922Z * [new branch] gh/anijain2305/950/head -> origin/gh/anijain2305/950/head 2025-12-04T12:53:07.9384117Z * [new branch] gh/anijain2305/950/orig -> origin/gh/anijain2305/950/orig 2025-12-04T12:53:07.9384304Z * [new branch] gh/anijain2305/951/base -> origin/gh/anijain2305/951/base 2025-12-04T12:53:07.9384490Z * [new branch] gh/anijain2305/951/head -> origin/gh/anijain2305/951/head 2025-12-04T12:53:07.9384711Z * [new branch] gh/anijain2305/951/orig -> origin/gh/anijain2305/951/orig 2025-12-04T12:53:07.9384899Z * [new branch] gh/anijain2305/952/base -> origin/gh/anijain2305/952/base 2025-12-04T12:53:07.9385088Z * [new branch] gh/anijain2305/952/head -> origin/gh/anijain2305/952/head 2025-12-04T12:53:07.9385690Z * [new branch] gh/anijain2305/952/orig -> origin/gh/anijain2305/952/orig 2025-12-04T12:53:07.9385875Z * [new branch] gh/anijain2305/953/base -> origin/gh/anijain2305/953/base 2025-12-04T12:53:07.9386062Z * [new branch] gh/anijain2305/953/head -> origin/gh/anijain2305/953/head 2025-12-04T12:53:07.9386249Z * [new branch] gh/anijain2305/953/orig -> origin/gh/anijain2305/953/orig 2025-12-04T12:53:07.9386435Z * [new branch] gh/anijain2305/954/base -> origin/gh/anijain2305/954/base 2025-12-04T12:53:07.9386621Z * [new branch] gh/anijain2305/954/head -> origin/gh/anijain2305/954/head 2025-12-04T12:53:07.9386811Z * [new branch] gh/anijain2305/954/orig -> origin/gh/anijain2305/954/orig 2025-12-04T12:53:07.9386999Z * [new branch] gh/anijain2305/955/base -> origin/gh/anijain2305/955/base 2025-12-04T12:53:07.9387183Z * [new branch] gh/anijain2305/955/head -> origin/gh/anijain2305/955/head 2025-12-04T12:53:07.9387401Z * [new branch] gh/anijain2305/955/orig -> origin/gh/anijain2305/955/orig 2025-12-04T12:53:07.9387588Z * [new branch] gh/anijain2305/956/base -> origin/gh/anijain2305/956/base 2025-12-04T12:53:07.9387774Z * [new branch] gh/anijain2305/956/head -> origin/gh/anijain2305/956/head 2025-12-04T12:53:07.9387961Z * [new branch] gh/anijain2305/956/orig -> origin/gh/anijain2305/956/orig 2025-12-04T12:53:07.9388148Z * [new branch] gh/anijain2305/957/base -> origin/gh/anijain2305/957/base 2025-12-04T12:53:07.9388332Z * [new branch] gh/anijain2305/957/head -> origin/gh/anijain2305/957/head 2025-12-04T12:53:07.9388527Z * [new branch] gh/anijain2305/957/orig -> origin/gh/anijain2305/957/orig 2025-12-04T12:53:07.9388715Z * [new branch] gh/anijain2305/958/base -> origin/gh/anijain2305/958/base 2025-12-04T12:53:07.9388899Z * [new branch] gh/anijain2305/958/head -> origin/gh/anijain2305/958/head 2025-12-04T12:53:07.9389087Z * [new branch] gh/anijain2305/958/orig -> origin/gh/anijain2305/958/orig 2025-12-04T12:53:07.9389273Z * [new branch] gh/anijain2305/959/base -> origin/gh/anijain2305/959/base 2025-12-04T12:53:07.9389458Z * [new branch] gh/anijain2305/959/head -> origin/gh/anijain2305/959/head 2025-12-04T12:53:07.9389648Z * [new branch] gh/anijain2305/959/orig -> origin/gh/anijain2305/959/orig 2025-12-04T12:53:07.9389833Z * [new branch] gh/anijain2305/960/base -> origin/gh/anijain2305/960/base 2025-12-04T12:53:07.9390022Z 
* [new branch] gh/anijain2305/960/head -> origin/gh/anijain2305/960/head 2025-12-04T12:53:07.9390245Z * [new branch] gh/anijain2305/960/orig -> origin/gh/anijain2305/960/orig 2025-12-04T12:53:07.9390430Z * [new branch] gh/anijain2305/961/base -> origin/gh/anijain2305/961/base 2025-12-04T12:53:07.9390619Z * [new branch] gh/anijain2305/961/head -> origin/gh/anijain2305/961/head 2025-12-04T12:53:07.9390803Z * [new branch] gh/anijain2305/961/orig -> origin/gh/anijain2305/961/orig 2025-12-04T12:53:07.9390994Z * [new branch] gh/anijain2305/962/base -> origin/gh/anijain2305/962/base 2025-12-04T12:53:07.9391178Z * [new branch] gh/anijain2305/962/head -> origin/gh/anijain2305/962/head 2025-12-04T12:53:07.9391363Z * [new branch] gh/anijain2305/962/orig -> origin/gh/anijain2305/962/orig 2025-12-04T12:53:07.9391546Z * [new branch] gh/anijain2305/963/base -> origin/gh/anijain2305/963/base 2025-12-04T12:53:07.9391768Z * [new branch] gh/anijain2305/963/head -> origin/gh/anijain2305/963/head 2025-12-04T12:53:07.9391953Z * [new branch] gh/anijain2305/963/orig -> origin/gh/anijain2305/963/orig 2025-12-04T12:53:07.9392138Z * [new branch] gh/anijain2305/964/base -> origin/gh/anijain2305/964/base 2025-12-04T12:53:07.9392331Z * [new branch] gh/anijain2305/964/head -> origin/gh/anijain2305/964/head 2025-12-04T12:53:07.9392516Z * [new branch] gh/anijain2305/964/orig -> origin/gh/anijain2305/964/orig 2025-12-04T12:53:07.9392699Z * [new branch] gh/anijain2305/965/base -> origin/gh/anijain2305/965/base 2025-12-04T12:53:07.9392883Z * [new branch] gh/anijain2305/965/head -> origin/gh/anijain2305/965/head 2025-12-04T12:53:07.9393067Z * [new branch] gh/anijain2305/965/orig -> origin/gh/anijain2305/965/orig 2025-12-04T12:53:07.9393255Z * [new branch] gh/anijain2305/966/base -> origin/gh/anijain2305/966/base 2025-12-04T12:53:07.9393446Z * [new branch] gh/anijain2305/966/head -> origin/gh/anijain2305/966/head 2025-12-04T12:53:07.9393629Z * [new branch] gh/anijain2305/966/orig -> origin/gh/anijain2305/966/orig 2025-12-04T12:53:07.9393817Z * [new branch] gh/anijain2305/967/base -> origin/gh/anijain2305/967/base 2025-12-04T12:53:07.9394034Z * [new branch] gh/anijain2305/967/head -> origin/gh/anijain2305/967/head 2025-12-04T12:53:07.9394217Z * [new branch] gh/anijain2305/967/orig -> origin/gh/anijain2305/967/orig 2025-12-04T12:53:07.9394403Z * [new branch] gh/anijain2305/968/base -> origin/gh/anijain2305/968/base 2025-12-04T12:53:07.9394595Z * [new branch] gh/anijain2305/968/head -> origin/gh/anijain2305/968/head 2025-12-04T12:53:07.9394780Z * [new branch] gh/anijain2305/968/orig -> origin/gh/anijain2305/968/orig 2025-12-04T12:53:07.9394969Z * [new branch] gh/anijain2305/969/base -> origin/gh/anijain2305/969/base 2025-12-04T12:53:07.9395153Z * [new branch] gh/anijain2305/969/head -> origin/gh/anijain2305/969/head 2025-12-04T12:53:07.9395336Z * [new branch] gh/anijain2305/969/orig -> origin/gh/anijain2305/969/orig 2025-12-04T12:53:07.9395523Z * [new branch] gh/anijain2305/970/base -> origin/gh/anijain2305/970/base 2025-12-04T12:53:07.9395718Z * [new branch] gh/anijain2305/970/head -> origin/gh/anijain2305/970/head 2025-12-04T12:53:07.9395906Z * [new branch] gh/anijain2305/970/orig -> origin/gh/anijain2305/970/orig 2025-12-04T12:53:07.9396094Z * [new branch] gh/anjali411/216/base -> origin/gh/anjali411/216/base 2025-12-04T12:53:07.9396282Z * [new branch] gh/anjali411/216/head -> origin/gh/anjali411/216/head 2025-12-04T12:53:07.9396463Z * [new branch] gh/anjali411/216/orig -> origin/gh/anjali411/216/orig 
2025-12-04T12:53:07.9396648Z * [new branch] gh/anshul-si/1/base -> origin/gh/anshul-si/1/base 2025-12-04T12:53:07.9396839Z * [new branch] gh/anshul-si/1/head -> origin/gh/anshul-si/1/head 2025-12-04T12:53:07.9397015Z * [new branch] gh/anshul-si/2/base -> origin/gh/anshul-si/2/base 2025-12-04T12:53:07.9397195Z * [new branch] gh/anshul-si/2/head -> origin/gh/anshul-si/2/head 2025-12-04T12:53:07.9397373Z * [new branch] gh/anshul-si/3/base -> origin/gh/anshul-si/3/base 2025-12-04T12:53:07.9397548Z * [new branch] gh/anshul-si/3/head -> origin/gh/anshul-si/3/head 2025-12-04T12:53:07.9397725Z * [new branch] gh/anshul-si/4/base -> origin/gh/anshul-si/4/base 2025-12-04T12:53:07.9397901Z * [new branch] gh/anshul-si/4/head -> origin/gh/anshul-si/4/head 2025-12-04T12:53:07.9398083Z * [new branch] gh/anshul-si/5/base -> origin/gh/anshul-si/5/base 2025-12-04T12:53:07.9398292Z * [new branch] gh/anshul-si/5/head -> origin/gh/anshul-si/5/head 2025-12-04T12:53:07.9398473Z * [new branch] gh/anshul-si/53/base -> origin/gh/anshul-si/53/base 2025-12-04T12:53:07.9398662Z * [new branch] gh/anshul-si/53/head -> origin/gh/anshul-si/53/head 2025-12-04T12:53:07.9398842Z * [new branch] gh/anshul-si/58/base -> origin/gh/anshul-si/58/base 2025-12-04T12:53:07.9399019Z * [new branch] gh/anshul-si/58/head -> origin/gh/anshul-si/58/head 2025-12-04T12:53:07.9399203Z * [new branch] gh/anshul-si/66/base -> origin/gh/anshul-si/66/base 2025-12-04T12:53:07.9399384Z * [new branch] gh/anshul-si/66/head -> origin/gh/anshul-si/66/head 2025-12-04T12:53:07.9399561Z * [new branch] gh/anshul-si/66/orig -> origin/gh/anshul-si/66/orig 2025-12-04T12:53:07.9399739Z * [new branch] gh/anshul-si/67/base -> origin/gh/anshul-si/67/base 2025-12-04T12:53:07.9399920Z * [new branch] gh/anshul-si/67/head -> origin/gh/anshul-si/67/head 2025-12-04T12:53:07.9400101Z * [new branch] gh/anshul-si/67/orig -> origin/gh/anshul-si/67/orig 2025-12-04T12:53:07.9400324Z * [new branch] gh/anshul-si/68/base -> origin/gh/anshul-si/68/base 2025-12-04T12:53:07.9400546Z * [new branch] gh/anshul-si/68/head -> origin/gh/anshul-si/68/head 2025-12-04T12:53:07.9400724Z * [new branch] gh/anshul-si/68/orig -> origin/gh/anshul-si/68/orig 2025-12-04T12:53:07.9400904Z * [new branch] gh/anshul-si/69/base -> origin/gh/anshul-si/69/base 2025-12-04T12:53:07.9401088Z * [new branch] gh/anshul-si/69/head -> origin/gh/anshul-si/69/head 2025-12-04T12:53:07.9401266Z * [new branch] gh/anshul-si/69/orig -> origin/gh/anshul-si/69/orig 2025-12-04T12:53:07.9401446Z * [new branch] gh/anshul-si/70/base -> origin/gh/anshul-si/70/base 2025-12-04T12:53:07.9401624Z * [new branch] gh/anshul-si/70/head -> origin/gh/anshul-si/70/head 2025-12-04T12:53:07.9401804Z * [new branch] gh/anshul-si/70/orig -> origin/gh/anshul-si/70/orig 2025-12-04T12:53:07.9401983Z * [new branch] gh/anshul-si/71/base -> origin/gh/anshul-si/71/base 2025-12-04T12:53:07.9402168Z * [new branch] gh/anshul-si/71/head -> origin/gh/anshul-si/71/head 2025-12-04T12:53:07.9402348Z * [new branch] gh/anshul-si/71/orig -> origin/gh/anshul-si/71/orig 2025-12-04T12:53:07.9402527Z * [new branch] gh/anshul-si/72/base -> origin/gh/anshul-si/72/base 2025-12-04T12:53:07.9402704Z * [new branch] gh/anshul-si/72/head -> origin/gh/anshul-si/72/head 2025-12-04T12:53:07.9402881Z * [new branch] gh/anshul-si/72/orig -> origin/gh/anshul-si/72/orig 2025-12-04T12:53:07.9403064Z * [new branch] gh/anshul-si/73/base -> origin/gh/anshul-si/73/base 2025-12-04T12:53:07.9403247Z * [new branch] gh/anshul-si/73/head -> origin/gh/anshul-si/73/head 
2025-12-04T12:53:07.9403426Z * [new branch] gh/anshul-si/73/orig -> origin/gh/anshul-si/73/orig 2025-12-04T12:53:07.9403610Z * [new branch] gh/aorenste/132/base -> origin/gh/aorenste/132/base 2025-12-04T12:53:07.9403796Z * [new branch] gh/aorenste/132/head -> origin/gh/aorenste/132/head 2025-12-04T12:53:07.9403987Z * [new branch] gh/aorenste/134/base -> origin/gh/aorenste/134/base 2025-12-04T12:53:07.9404171Z * [new branch] gh/aorenste/134/head -> origin/gh/aorenste/134/head 2025-12-04T12:53:07.9404351Z * [new branch] gh/aorenste/134/orig -> origin/gh/aorenste/134/orig 2025-12-04T12:53:07.9404535Z * [new branch] gh/aorenste/139/base -> origin/gh/aorenste/139/base 2025-12-04T12:53:07.9404767Z * [new branch] gh/aorenste/139/head -> origin/gh/aorenste/139/head 2025-12-04T12:53:07.9404946Z * [new branch] gh/aorenste/139/orig -> origin/gh/aorenste/139/orig 2025-12-04T12:53:07.9405134Z * [new branch] gh/aorenste/141/base -> origin/gh/aorenste/141/base 2025-12-04T12:53:07.9405316Z * [new branch] gh/aorenste/141/head -> origin/gh/aorenste/141/head 2025-12-04T12:53:07.9405496Z * [new branch] gh/aorenste/145/base -> origin/gh/aorenste/145/base 2025-12-04T12:53:07.9405678Z * [new branch] gh/aorenste/145/head -> origin/gh/aorenste/145/head 2025-12-04T12:53:07.9405858Z * [new branch] gh/aorenste/145/orig -> origin/gh/aorenste/145/orig 2025-12-04T12:53:07.9406044Z * [new branch] gh/aorenste/146/base -> origin/gh/aorenste/146/base 2025-12-04T12:53:07.9406226Z * [new branch] gh/aorenste/146/head -> origin/gh/aorenste/146/head 2025-12-04T12:53:07.9406408Z * [new branch] gh/aorenste/146/orig -> origin/gh/aorenste/146/orig 2025-12-04T12:53:07.9406588Z * [new branch] gh/aorenste/147/base -> origin/gh/aorenste/147/base 2025-12-04T12:53:07.9406768Z * [new branch] gh/aorenste/147/head -> origin/gh/aorenste/147/head 2025-12-04T12:53:07.9406947Z * [new branch] gh/aorenste/147/orig -> origin/gh/aorenste/147/orig 2025-12-04T12:53:07.9407164Z * [new branch] gh/aorenste/148/base -> origin/gh/aorenste/148/base 2025-12-04T12:53:07.9407346Z * [new branch] gh/aorenste/148/head -> origin/gh/aorenste/148/head 2025-12-04T12:53:07.9407522Z * [new branch] gh/aorenste/148/orig -> origin/gh/aorenste/148/orig 2025-12-04T12:53:07.9407704Z * [new branch] gh/aorenste/149/base -> origin/gh/aorenste/149/base 2025-12-04T12:53:07.9407894Z * [new branch] gh/aorenste/149/head -> origin/gh/aorenste/149/head 2025-12-04T12:53:07.9408075Z * [new branch] gh/aorenste/149/orig -> origin/gh/aorenste/149/orig 2025-12-04T12:53:07.9408254Z * [new branch] gh/aorenste/150/base -> origin/gh/aorenste/150/base 2025-12-04T12:53:07.9408436Z * [new branch] gh/aorenste/150/head -> origin/gh/aorenste/150/head 2025-12-04T12:53:07.9408618Z * [new branch] gh/aorenste/150/orig -> origin/gh/aorenste/150/orig 2025-12-04T12:53:07.9408805Z * [new branch] gh/aorenste/151/base -> origin/gh/aorenste/151/base 2025-12-04T12:53:07.9408993Z * [new branch] gh/aorenste/151/head -> origin/gh/aorenste/151/head 2025-12-04T12:53:07.9409171Z * [new branch] gh/aorenste/151/orig -> origin/gh/aorenste/151/orig 2025-12-04T12:53:07.9409352Z * [new branch] gh/aorenste/152/base -> origin/gh/aorenste/152/base 2025-12-04T12:53:07.9409533Z * [new branch] gh/aorenste/152/head -> origin/gh/aorenste/152/head 2025-12-04T12:53:07.9409717Z * [new branch] gh/aorenste/152/orig -> origin/gh/aorenste/152/orig 2025-12-04T12:53:07.9409901Z * [new branch] gh/aorenste/153/base -> origin/gh/aorenste/153/base 2025-12-04T12:53:07.9410080Z * [new branch] gh/aorenste/153/head -> origin/gh/aorenste/153/head 
2025-12-04T12:53:07.9410322Z * [new branch] gh/aorenste/153/orig -> origin/gh/aorenste/153/orig 2025-12-04T12:53:07.9410504Z * [new branch] gh/aorenste/154/base -> origin/gh/aorenste/154/base 2025-12-04T12:53:07.9410687Z * [new branch] gh/aorenste/154/head -> origin/gh/aorenste/154/head 2025-12-04T12:53:07.9410870Z * [new branch] gh/aorenste/154/orig -> origin/gh/aorenste/154/orig 2025-12-04T12:53:07.9411051Z * [new branch] gh/aorenste/155/base -> origin/gh/aorenste/155/base 2025-12-04T12:53:07.9411229Z * [new branch] gh/aorenste/155/head -> origin/gh/aorenste/155/head 2025-12-04T12:53:07.9411470Z * [new branch] gh/aorenste/155/orig -> origin/gh/aorenste/155/orig 2025-12-04T12:53:07.9411649Z * [new branch] gh/aorenste/156/base -> origin/gh/aorenste/156/base 2025-12-04T12:53:07.9411832Z * [new branch] gh/aorenste/156/head -> origin/gh/aorenste/156/head 2025-12-04T12:53:07.9412019Z * [new branch] gh/aorenste/156/orig -> origin/gh/aorenste/156/orig 2025-12-04T12:53:07.9412207Z * [new branch] gh/aorenste/157/base -> origin/gh/aorenste/157/base 2025-12-04T12:53:07.9412388Z * [new branch] gh/aorenste/157/head -> origin/gh/aorenste/157/head 2025-12-04T12:53:07.9412571Z * [new branch] gh/aorenste/157/orig -> origin/gh/aorenste/157/orig 2025-12-04T12:53:07.9412756Z * [new branch] gh/aorenste/158/base -> origin/gh/aorenste/158/base 2025-12-04T12:53:07.9412938Z * [new branch] gh/aorenste/158/head -> origin/gh/aorenste/158/head 2025-12-04T12:53:07.9413122Z * [new branch] gh/aorenste/158/orig -> origin/gh/aorenste/158/orig 2025-12-04T12:53:07.9413301Z * [new branch] gh/aorenste/159/base -> origin/gh/aorenste/159/base 2025-12-04T12:53:07.9413480Z * [new branch] gh/aorenste/159/head -> origin/gh/aorenste/159/head 2025-12-04T12:53:07.9413711Z * [new branch] gh/aorenste/159/orig -> origin/gh/aorenste/159/orig 2025-12-04T12:53:07.9413905Z * [new branch] gh/avikchaudhuri/1/base -> origin/gh/avikchaudhuri/1/base 2025-12-04T12:53:07.9414105Z * [new branch] gh/avikchaudhuri/1/head -> origin/gh/avikchaudhuri/1/head 2025-12-04T12:53:07.9414299Z * [new branch] gh/avikchaudhuri/2/base -> origin/gh/avikchaudhuri/2/base 2025-12-04T12:53:07.9414489Z * [new branch] gh/avikchaudhuri/2/head -> origin/gh/avikchaudhuri/2/head 2025-12-04T12:53:07.9414681Z * [new branch] gh/avikchaudhuri/2/orig -> origin/gh/avikchaudhuri/2/orig 2025-12-04T12:53:07.9414871Z * [new branch] gh/bdhirsh/666/base -> origin/gh/bdhirsh/666/base 2025-12-04T12:53:07.9415055Z * [new branch] gh/bdhirsh/666/head -> origin/gh/bdhirsh/666/head 2025-12-04T12:53:07.9415237Z * [new branch] gh/bdhirsh/666/orig -> origin/gh/bdhirsh/666/orig 2025-12-04T12:53:07.9415419Z * [new branch] gh/bdhirsh/668/base -> origin/gh/bdhirsh/668/base 2025-12-04T12:53:07.9415599Z * [new branch] gh/bdhirsh/668/head -> origin/gh/bdhirsh/668/head 2025-12-04T12:53:07.9415782Z * [new branch] gh/bdhirsh/668/orig -> origin/gh/bdhirsh/668/orig 2025-12-04T12:53:07.9415960Z * [new branch] gh/bdhirsh/669/base -> origin/gh/bdhirsh/669/base 2025-12-04T12:53:07.9416138Z * [new branch] gh/bdhirsh/669/head -> origin/gh/bdhirsh/669/head 2025-12-04T12:53:07.9416321Z * [new branch] gh/bdhirsh/669/orig -> origin/gh/bdhirsh/669/orig 2025-12-04T12:53:07.9416508Z * [new branch] gh/bdhirsh/670/base -> origin/gh/bdhirsh/670/base 2025-12-04T12:53:07.9416683Z * [new branch] gh/bdhirsh/670/head -> origin/gh/bdhirsh/670/head 2025-12-04T12:53:07.9416864Z * [new branch] gh/bdhirsh/670/orig -> origin/gh/bdhirsh/670/orig 2025-12-04T12:53:07.9417042Z * [new branch] gh/bdhirsh/672/base -> 
origin/gh/bdhirsh/672/base 2025-12-04T12:53:07.9417222Z * [new branch] gh/bdhirsh/672/head -> origin/gh/bdhirsh/672/head 2025-12-04T12:53:07.9417404Z * [new branch] gh/bdhirsh/672/orig -> origin/gh/bdhirsh/672/orig 2025-12-04T12:53:07.9417582Z * [new branch] gh/bdhirsh/675/base -> origin/gh/bdhirsh/675/base 2025-12-04T12:53:07.9417758Z * [new branch] gh/bdhirsh/675/head -> origin/gh/bdhirsh/675/head 2025-12-04T12:53:07.9417939Z * [new branch] gh/bdhirsh/675/orig -> origin/gh/bdhirsh/675/orig 2025-12-04T12:53:07.9418156Z * [new branch] gh/bdhirsh/676/base -> origin/gh/bdhirsh/676/base 2025-12-04T12:53:07.9418335Z * [new branch] gh/bdhirsh/676/head -> origin/gh/bdhirsh/676/head 2025-12-04T12:53:07.9418514Z * [new branch] gh/bdhirsh/676/orig -> origin/gh/bdhirsh/676/orig 2025-12-04T12:53:07.9418690Z * [new branch] gh/bdhirsh/677/base -> origin/gh/bdhirsh/677/base 2025-12-04T12:53:07.9418761Z * [new branch] gh/bdhirsh/677/head -> origin/gh/bdhirsh/677/head 2025-12-04T12:53:07.9418833Z * [new branch] gh/bdhirsh/677/orig -> origin/gh/bdhirsh/677/orig 2025-12-04T12:53:07.9418905Z * [new branch] gh/bdhirsh/678/base -> origin/gh/bdhirsh/678/base 2025-12-04T12:53:07.9418978Z * [new branch] gh/bdhirsh/678/head -> origin/gh/bdhirsh/678/head 2025-12-04T12:53:07.9419046Z * [new branch] gh/bdhirsh/678/orig -> origin/gh/bdhirsh/678/orig 2025-12-04T12:53:07.9419122Z * [new branch] gh/bdhirsh/679/base -> origin/gh/bdhirsh/679/base 2025-12-04T12:53:07.9419192Z * [new branch] gh/bdhirsh/679/head -> origin/gh/bdhirsh/679/head 2025-12-04T12:53:07.9419260Z * [new branch] gh/bdhirsh/679/orig -> origin/gh/bdhirsh/679/orig 2025-12-04T12:53:07.9419362Z * [new branch] gh/bdhirsh/680/base -> origin/gh/bdhirsh/680/base 2025-12-04T12:53:07.9419431Z * [new branch] gh/bdhirsh/680/head -> origin/gh/bdhirsh/680/head 2025-12-04T12:53:07.9419499Z * [new branch] gh/bdhirsh/680/orig -> origin/gh/bdhirsh/680/orig 2025-12-04T12:53:07.9419572Z * [new branch] gh/bdhirsh/681/base -> origin/gh/bdhirsh/681/base 2025-12-04T12:53:07.9419642Z * [new branch] gh/bdhirsh/681/head -> origin/gh/bdhirsh/681/head 2025-12-04T12:53:07.9419711Z * [new branch] gh/bdhirsh/681/orig -> origin/gh/bdhirsh/681/orig 2025-12-04T12:53:07.9419812Z * [new branch] gh/benjaminglass1/101/base -> origin/gh/benjaminglass1/101/base 2025-12-04T12:53:07.9419901Z * [new branch] gh/benjaminglass1/101/head -> origin/gh/benjaminglass1/101/head 2025-12-04T12:53:07.9419986Z * [new branch] gh/benjaminglass1/101/orig -> origin/gh/benjaminglass1/101/orig 2025-12-04T12:53:07.9420078Z * [new branch] gh/benjaminglass1/102/base -> origin/gh/benjaminglass1/102/base 2025-12-04T12:53:07.9420163Z * [new branch] gh/benjaminglass1/102/head -> origin/gh/benjaminglass1/102/head 2025-12-04T12:53:07.9420294Z * [new branch] gh/benjaminglass1/102/orig -> origin/gh/benjaminglass1/102/orig 2025-12-04T12:53:07.9420380Z * [new branch] gh/benjaminglass1/106/base -> origin/gh/benjaminglass1/106/base 2025-12-04T12:53:07.9420466Z * [new branch] gh/benjaminglass1/106/head -> origin/gh/benjaminglass1/106/head 2025-12-04T12:53:07.9420555Z * [new branch] gh/benjaminglass1/106/orig -> origin/gh/benjaminglass1/106/orig 2025-12-04T12:53:07.9420644Z * [new branch] gh/benjaminglass1/107/base -> origin/gh/benjaminglass1/107/base 2025-12-04T12:53:07.9420728Z * [new branch] gh/benjaminglass1/107/head -> origin/gh/benjaminglass1/107/head 2025-12-04T12:53:07.9420816Z * [new branch] gh/benjaminglass1/107/orig -> origin/gh/benjaminglass1/107/orig 2025-12-04T12:53:07.9420901Z * [new branch] 
gh/benjaminglass1/108/base -> origin/gh/benjaminglass1/108/base 2025-12-04T12:53:07.9420985Z * [new branch] gh/benjaminglass1/108/head -> origin/gh/benjaminglass1/108/head 2025-12-04T12:53:07.9421072Z * [new branch] gh/benjaminglass1/108/orig -> origin/gh/benjaminglass1/108/orig 2025-12-04T12:53:07.9421158Z * [new branch] gh/benjaminglass1/109/base -> origin/gh/benjaminglass1/109/base 2025-12-04T12:53:07.9421297Z * [new branch] gh/benjaminglass1/109/head -> origin/gh/benjaminglass1/109/head 2025-12-04T12:53:07.9421384Z * [new branch] gh/benjaminglass1/109/orig -> origin/gh/benjaminglass1/109/orig 2025-12-04T12:53:07.9421468Z * [new branch] gh/benjaminglass1/97/base -> origin/gh/benjaminglass1/97/base 2025-12-04T12:53:07.9421553Z * [new branch] gh/benjaminglass1/97/head -> origin/gh/benjaminglass1/97/head 2025-12-04T12:53:07.9421643Z * [new branch] gh/benjaminglass1/97/orig -> origin/gh/benjaminglass1/97/orig 2025-12-04T12:53:07.9421721Z * [new branch] gh/bobrenjc93/570/base -> origin/gh/bobrenjc93/570/base 2025-12-04T12:53:07.9421798Z * [new branch] gh/bobrenjc93/570/head -> origin/gh/bobrenjc93/570/head 2025-12-04T12:53:07.9421874Z * [new branch] gh/bobrenjc93/570/orig -> origin/gh/bobrenjc93/570/orig 2025-12-04T12:53:07.9421948Z * [new branch] gh/bobrenjc93/604/base -> origin/gh/bobrenjc93/604/base 2025-12-04T12:53:07.9422022Z * [new branch] gh/bobrenjc93/604/head -> origin/gh/bobrenjc93/604/head 2025-12-04T12:53:07.9422097Z * [new branch] gh/bobrenjc93/604/orig -> origin/gh/bobrenjc93/604/orig 2025-12-04T12:53:07.9422169Z * [new branch] gh/bobrenjc93/638/base -> origin/gh/bobrenjc93/638/base 2025-12-04T12:53:07.9422286Z * [new branch] gh/bobrenjc93/638/head -> origin/gh/bobrenjc93/638/head 2025-12-04T12:53:07.9422362Z * [new branch] gh/bobrenjc93/638/orig -> origin/gh/bobrenjc93/638/orig 2025-12-04T12:53:07.9422437Z * [new branch] gh/bobrenjc93/653/base -> origin/gh/bobrenjc93/653/base 2025-12-04T12:53:07.9422515Z * [new branch] gh/bobrenjc93/653/head -> origin/gh/bobrenjc93/653/head 2025-12-04T12:53:07.9422591Z * [new branch] gh/bobrenjc93/653/orig -> origin/gh/bobrenjc93/653/orig 2025-12-04T12:53:07.9422666Z * [new branch] gh/bobrenjc93/654/base -> origin/gh/bobrenjc93/654/base 2025-12-04T12:53:07.9422743Z * [new branch] gh/bobrenjc93/654/head -> origin/gh/bobrenjc93/654/head 2025-12-04T12:53:07.9422815Z * [new branch] gh/bobrenjc93/654/orig -> origin/gh/bobrenjc93/654/orig 2025-12-04T12:53:07.9422887Z * [new branch] gh/bobrenjc93/657/base -> origin/gh/bobrenjc93/657/base 2025-12-04T12:53:07.9422962Z * [new branch] gh/bobrenjc93/657/head -> origin/gh/bobrenjc93/657/head 2025-12-04T12:53:07.9423034Z * [new branch] gh/bobrenjc93/657/orig -> origin/gh/bobrenjc93/657/orig 2025-12-04T12:53:07.9423108Z * [new branch] gh/bobrenjc93/672/base -> origin/gh/bobrenjc93/672/base 2025-12-04T12:53:07.9423181Z * [new branch] gh/bobrenjc93/672/head -> origin/gh/bobrenjc93/672/head 2025-12-04T12:53:07.9423253Z * [new branch] gh/bobrenjc93/672/orig -> origin/gh/bobrenjc93/672/orig 2025-12-04T12:53:07.9423331Z * [new branch] gh/bobrenjc93/679/base -> origin/gh/bobrenjc93/679/base 2025-12-04T12:53:07.9423410Z * [new branch] gh/bobrenjc93/679/head -> origin/gh/bobrenjc93/679/head 2025-12-04T12:53:07.9423483Z * [new branch] gh/bobrenjc93/679/orig -> origin/gh/bobrenjc93/679/orig 2025-12-04T12:53:07.9423555Z * [new branch] gh/bobrenjc93/680/base -> origin/gh/bobrenjc93/680/base 2025-12-04T12:53:07.9423633Z * [new branch] gh/bobrenjc93/680/head -> origin/gh/bobrenjc93/680/head 2025-12-04T12:53:07.9423706Z * 
[new branch] gh/bobrenjc93/680/orig -> origin/gh/bobrenjc93/680/orig 2025-12-04T12:53:07.9423777Z * [new branch] gh/bobrenjc93/681/base -> origin/gh/bobrenjc93/681/base 2025-12-04T12:53:07.9423855Z * [new branch] gh/bobrenjc93/681/head -> origin/gh/bobrenjc93/681/head 2025-12-04T12:53:07.9423927Z * [new branch] gh/bobrenjc93/681/orig -> origin/gh/bobrenjc93/681/orig 2025-12-04T12:53:07.9424037Z * [new branch] gh/bobrenjc93/682/base -> origin/gh/bobrenjc93/682/base 2025-12-04T12:53:07.9424113Z * [new branch] gh/bobrenjc93/682/head -> origin/gh/bobrenjc93/682/head 2025-12-04T12:53:07.9424188Z * [new branch] gh/bobrenjc93/682/orig -> origin/gh/bobrenjc93/682/orig 2025-12-04T12:53:07.9424267Z * [new branch] gh/bobrenjc93/683/base -> origin/gh/bobrenjc93/683/base 2025-12-04T12:53:07.9424339Z * [new branch] gh/bobrenjc93/683/head -> origin/gh/bobrenjc93/683/head 2025-12-04T12:53:07.9424411Z * [new branch] gh/bobrenjc93/683/orig -> origin/gh/bobrenjc93/683/orig 2025-12-04T12:53:07.9424485Z * [new branch] gh/bobrenjc93/684/base -> origin/gh/bobrenjc93/684/base 2025-12-04T12:53:07.9424556Z * [new branch] gh/bobrenjc93/684/head -> origin/gh/bobrenjc93/684/head 2025-12-04T12:53:07.9424629Z * [new branch] gh/bobrenjc93/684/orig -> origin/gh/bobrenjc93/684/orig 2025-12-04T12:53:07.9424704Z * [new branch] gh/bobrenjc93/685/base -> origin/gh/bobrenjc93/685/base 2025-12-04T12:53:07.9424778Z * [new branch] gh/bobrenjc93/685/head -> origin/gh/bobrenjc93/685/head 2025-12-04T12:53:07.9424850Z * [new branch] gh/bobrenjc93/685/orig -> origin/gh/bobrenjc93/685/orig 2025-12-04T12:53:07.9424969Z * [new branch] gh/bobrenjc93/686/base -> origin/gh/bobrenjc93/686/base 2025-12-04T12:53:07.9425043Z * [new branch] gh/bobrenjc93/686/head -> origin/gh/bobrenjc93/686/head 2025-12-04T12:53:07.9425115Z * [new branch] gh/bobrenjc93/686/orig -> origin/gh/bobrenjc93/686/orig 2025-12-04T12:53:07.9425189Z * [new branch] gh/bobrenjc93/687/base -> origin/gh/bobrenjc93/687/base 2025-12-04T12:53:07.9425261Z * [new branch] gh/bobrenjc93/687/head -> origin/gh/bobrenjc93/687/head 2025-12-04T12:53:07.9425339Z * [new branch] gh/bobrenjc93/687/orig -> origin/gh/bobrenjc93/687/orig 2025-12-04T12:53:07.9425416Z * [new branch] gh/bobrenjc93/688/base -> origin/gh/bobrenjc93/688/base 2025-12-04T12:53:07.9425489Z * [new branch] gh/bobrenjc93/688/head -> origin/gh/bobrenjc93/688/head 2025-12-04T12:53:07.9425562Z * [new branch] gh/bobrenjc93/688/orig -> origin/gh/bobrenjc93/688/orig 2025-12-04T12:53:07.9425645Z * [new branch] gh/bobrenjc93/689/base -> origin/gh/bobrenjc93/689/base 2025-12-04T12:53:07.9425717Z * [new branch] gh/bobrenjc93/689/head -> origin/gh/bobrenjc93/689/head 2025-12-04T12:53:07.9425793Z * [new branch] gh/bobrenjc93/689/orig -> origin/gh/bobrenjc93/689/orig 2025-12-04T12:53:07.9425865Z * [new branch] gh/bobrenjc93/690/base -> origin/gh/bobrenjc93/690/base 2025-12-04T12:53:07.9425938Z * [new branch] gh/bobrenjc93/690/head -> origin/gh/bobrenjc93/690/head 2025-12-04T12:53:07.9426013Z * [new branch] gh/bobrenjc93/690/orig -> origin/gh/bobrenjc93/690/orig 2025-12-04T12:53:07.9426086Z * [new branch] gh/bobrenjc93/691/base -> origin/gh/bobrenjc93/691/base 2025-12-04T12:53:07.9426158Z * [new branch] gh/bobrenjc93/691/head -> origin/gh/bobrenjc93/691/head 2025-12-04T12:53:07.9426235Z * [new branch] gh/bobrenjc93/691/orig -> origin/gh/bobrenjc93/691/orig 2025-12-04T12:53:07.9426309Z * [new branch] gh/bobrenjc93/692/base -> origin/gh/bobrenjc93/692/base 2025-12-04T12:53:07.9426385Z * [new branch] gh/bobrenjc93/692/head -> 
origin/gh/bobrenjc93/692/head 2025-12-04T12:53:07.9426462Z * [new branch] gh/bobrenjc93/692/orig -> origin/gh/bobrenjc93/692/orig 2025-12-04T12:53:07.9426533Z * [new branch] gh/bobrenjc93/693/base -> origin/gh/bobrenjc93/693/base 2025-12-04T12:53:07.9426606Z * [new branch] gh/bobrenjc93/693/head -> origin/gh/bobrenjc93/693/head 2025-12-04T12:53:07.9426710Z * [new branch] gh/bobrenjc93/693/orig -> origin/gh/bobrenjc93/693/orig 2025-12-04T12:53:07.9426785Z * [new branch] gh/bobrenjc93/694/base -> origin/gh/bobrenjc93/694/base 2025-12-04T12:53:07.9426858Z * [new branch] gh/bobrenjc93/694/head -> origin/gh/bobrenjc93/694/head 2025-12-04T12:53:07.9426933Z * [new branch] gh/bobrenjc93/694/orig -> origin/gh/bobrenjc93/694/orig 2025-12-04T12:53:07.9427005Z * [new branch] gh/bobrenjc93/695/base -> origin/gh/bobrenjc93/695/base 2025-12-04T12:53:07.9427078Z * [new branch] gh/bobrenjc93/695/head -> origin/gh/bobrenjc93/695/head 2025-12-04T12:53:07.9427154Z * [new branch] gh/bobrenjc93/695/orig -> origin/gh/bobrenjc93/695/orig 2025-12-04T12:53:07.9427223Z * [new branch] gh/c00w/23/base -> origin/gh/c00w/23/base 2025-12-04T12:53:07.9427295Z * [new branch] gh/c00w/23/head -> origin/gh/c00w/23/head 2025-12-04T12:53:07.9427361Z * [new branch] gh/c00w/53/base -> origin/gh/c00w/53/base 2025-12-04T12:53:07.9427423Z * [new branch] gh/c00w/53/head -> origin/gh/c00w/53/head 2025-12-04T12:53:07.9427487Z * [new branch] gh/c00w/53/orig -> origin/gh/c00w/53/orig 2025-12-04T12:53:07.9427579Z * [new branch] gh/c00w/54/base -> origin/gh/c00w/54/base 2025-12-04T12:53:07.9427643Z * [new branch] gh/c00w/54/head -> origin/gh/c00w/54/head 2025-12-04T12:53:07.9427709Z * [new branch] gh/c00w/54/orig -> origin/gh/c00w/54/orig 2025-12-04T12:53:07.9427772Z * [new branch] gh/c00w/56/base -> origin/gh/c00w/56/base 2025-12-04T12:53:07.9427834Z * [new branch] gh/c00w/56/head -> origin/gh/c00w/56/head 2025-12-04T12:53:07.9427898Z * [new branch] gh/c00w/56/orig -> origin/gh/c00w/56/orig 2025-12-04T12:53:07.9427962Z * [new branch] gh/c00w/57/base -> origin/gh/c00w/57/base 2025-12-04T12:53:07.9428028Z * [new branch] gh/c00w/57/head -> origin/gh/c00w/57/head 2025-12-04T12:53:07.9428093Z * [new branch] gh/c00w/57/orig -> origin/gh/c00w/57/orig 2025-12-04T12:53:07.9428154Z * [new branch] gh/c00w/58/base -> origin/gh/c00w/58/base 2025-12-04T12:53:07.9428219Z * [new branch] gh/c00w/58/head -> origin/gh/c00w/58/head 2025-12-04T12:53:07.9428283Z * [new branch] gh/c00w/58/orig -> origin/gh/c00w/58/orig 2025-12-04T12:53:07.9428356Z * [new branch] gh/clee2000/1/base -> origin/gh/clee2000/1/base 2025-12-04T12:53:07.9428425Z * [new branch] gh/clee2000/1/head -> origin/gh/clee2000/1/head 2025-12-04T12:53:07.9428605Z * [new branch] gh/clee2000/1/orig -> origin/gh/clee2000/1/orig 2025-12-04T12:53:07.9428684Z * [new branch] gh/coconutruben/1/base -> origin/gh/coconutruben/1/base 2025-12-04T12:53:07.9428762Z * [new branch] gh/coconutruben/1/head -> origin/gh/coconutruben/1/head 2025-12-04T12:53:07.9428848Z * [new branch] gh/coconutruben/55/base -> origin/gh/coconutruben/55/base 2025-12-04T12:53:07.9428928Z * [new branch] gh/coconutruben/55/head -> origin/gh/coconutruben/55/head 2025-12-04T12:53:07.9429006Z * [new branch] gh/coconutruben/55/orig -> origin/gh/coconutruben/55/orig 2025-12-04T12:53:07.9429085Z * [new branch] gh/coconutruben/57/base -> origin/gh/coconutruben/57/base 2025-12-04T12:53:07.9429165Z * [new branch] gh/coconutruben/57/head -> origin/gh/coconutruben/57/head 2025-12-04T12:53:07.9429250Z * [new branch] gh/coconutruben/57/orig -> 
2025-12-04T12:53:07.9429328Z * [new branch] gh/coconutruben/70/base -> origin/gh/coconutruben/70/base
2025-12-04T12:53:07.9429445Z * [new branch] gh/coconutruben/70/head -> origin/gh/coconutruben/70/head
2025-12-04T12:53:07.9429531Z * [new branch] gh/coconutruben/70/orig -> origin/gh/coconutruben/70/orig
2025-12-04T12:53:07.9429611Z * [new branch] gh/coconutruben/71/base -> origin/gh/coconutruben/71/base
2025-12-04T12:53:07.9429692Z * [new branch] gh/coconutruben/71/head -> origin/gh/coconutruben/71/head
2025-12-04T12:53:07.9429780Z * [new branch] gh/coconutruben/71/orig -> origin/gh/coconutruben/71/orig
2025-12-04T12:53:07.9429861Z * [new branch] gh/coconutruben/72/base -> origin/gh/coconutruben/72/base
2025-12-04T12:53:07.9429941Z * [new branch] gh/coconutruben/72/head -> origin/gh/coconutruben/72/head
2025-12-04T12:53:07.9430026Z * [new branch] gh/coconutruben/72/orig -> origin/gh/coconutruben/72/orig
2025-12-04T12:53:07.9430106Z * [new branch] gh/coconutruben/73/base -> origin/gh/coconutruben/73/base
2025-12-04T12:53:07.9430222Z * [new branch] gh/coconutruben/73/head -> origin/gh/coconutruben/73/head
2025-12-04T12:53:07.9430308Z * [new branch] gh/coconutruben/73/orig -> origin/gh/coconutruben/73/orig
2025-12-04T12:53:07.9430388Z * [new branch] gh/coconutruben/74/base -> origin/gh/coconutruben/74/base
2025-12-04T12:53:07.9430507Z * [new branch] gh/coconutruben/74/head -> origin/gh/coconutruben/74/head
2025-12-04T12:53:07.9430595Z * [new branch] gh/coconutruben/74/orig -> origin/gh/coconutruben/74/orig
2025-12-04T12:53:07.9430675Z * [new branch] gh/coconutruben/79/base -> origin/gh/coconutruben/79/base
2025-12-04T12:53:07.9430757Z * [new branch] gh/coconutruben/79/head -> origin/gh/coconutruben/79/head
2025-12-04T12:53:07.9430842Z * [new branch] gh/coconutruben/79/orig -> origin/gh/coconutruben/79/orig
2025-12-04T12:53:07.9430923Z * [new branch] gh/coconutruben/80/base -> origin/gh/coconutruben/80/base
2025-12-04T12:53:07.9431008Z * [new branch] gh/coconutruben/80/head -> origin/gh/coconutruben/80/head
2025-12-04T12:53:07.9431088Z * [new branch] gh/coconutruben/80/orig -> origin/gh/coconutruben/80/orig
2025-12-04T12:53:07.9431168Z * [new branch] gh/coconutruben/82/base -> origin/gh/coconutruben/82/base
2025-12-04T12:53:07.9431255Z * [new branch] gh/coconutruben/82/head -> origin/gh/coconutruben/82/head
2025-12-04T12:53:07.9431335Z * [new branch] gh/coconutruben/82/orig -> origin/gh/coconutruben/82/orig
2025-12-04T12:53:07.9431415Z * [new branch] gh/coconutruben/83/base -> origin/gh/coconutruben/83/base
2025-12-04T12:53:07.9431503Z * [new branch] gh/coconutruben/83/head -> origin/gh/coconutruben/83/head
2025-12-04T12:53:07.9431582Z * [new branch] gh/coconutruben/83/orig -> origin/gh/coconutruben/83/orig
2025-12-04T12:53:07.9431663Z * [new branch] gh/coconutruben/84/base -> origin/gh/coconutruben/84/base
2025-12-04T12:53:07.9431748Z * [new branch] gh/coconutruben/84/head -> origin/gh/coconutruben/84/head
2025-12-04T12:53:07.9431826Z * [new branch] gh/coconutruben/84/orig -> origin/gh/coconutruben/84/orig
2025-12-04T12:53:07.9431906Z * [new branch] gh/coconutruben/85/base -> origin/gh/coconutruben/85/base
2025-12-04T12:53:07.9431991Z * [new branch] gh/coconutruben/85/head -> origin/gh/coconutruben/85/head
2025-12-04T12:53:07.9432070Z * [new branch] gh/coconutruben/85/orig -> origin/gh/coconutruben/85/orig
2025-12-04T12:53:07.9432149Z * [new branch] gh/coconutruben/86/base -> origin/gh/coconutruben/86/base
2025-12-04T12:53:07.9432234Z * [new branch] gh/coconutruben/86/head -> origin/gh/coconutruben/86/head
2025-12-04T12:53:07.9432312Z * [new branch] gh/coconutruben/86/orig -> origin/gh/coconutruben/86/orig
2025-12-04T12:53:07.9432441Z * [new branch] gh/colinchan15/1/base -> origin/gh/colinchan15/1/base
2025-12-04T12:53:07.9432522Z * [new branch] gh/colinchan15/1/head -> origin/gh/colinchan15/1/head
2025-12-04T12:53:07.9432598Z * [new branch] gh/colinchan15/2/base -> origin/gh/colinchan15/2/base
2025-12-04T12:53:07.9432680Z * [new branch] gh/colinchan15/2/head -> origin/gh/colinchan15/2/head
2025-12-04T12:53:07.9432757Z * [new branch] gh/colinchan15/3/base -> origin/gh/colinchan15/3/base
2025-12-04T12:53:07.9432832Z * [new branch] gh/colinchan15/3/head -> origin/gh/colinchan15/3/head
2025-12-04T12:53:07.9432913Z * [new branch] gh/colinchan15/6/base -> origin/gh/colinchan15/6/base
2025-12-04T12:53:07.9432988Z * [new branch] gh/colinchan15/6/head -> origin/gh/colinchan15/6/head
2025-12-04T12:53:07.9433057Z * [new branch] gh/d4l3k/1/base -> origin/gh/d4l3k/1/base
2025-12-04T12:53:07.9433136Z * [new branch] gh/d4l3k/1/head -> origin/gh/d4l3k/1/head
2025-12-04T12:53:07.9433203Z * [new branch] gh/d4l3k/2/base -> origin/gh/d4l3k/2/base
2025-12-04T12:53:07.9433270Z * [new branch] gh/d4l3k/2/head -> origin/gh/d4l3k/2/head
2025-12-04T12:53:07.9433376Z * [new branch] gh/d4l3k/2/orig -> origin/gh/d4l3k/2/orig
2025-12-04T12:53:07.9433444Z * [new branch] gh/d4l3k/3/base -> origin/gh/d4l3k/3/base
2025-12-04T12:53:07.9433512Z * [new branch] gh/d4l3k/3/head -> origin/gh/d4l3k/3/head
2025-12-04T12:53:07.9433585Z * [new branch] gh/d4l3k/3/orig -> origin/gh/d4l3k/3/orig
2025-12-04T12:53:07.9433653Z * [new branch] gh/d4l3k/4/base -> origin/gh/d4l3k/4/base
2025-12-04T12:53:07.9433720Z * [new branch] gh/d4l3k/4/head -> origin/gh/d4l3k/4/head
2025-12-04T12:53:07.9433796Z * [new branch] gh/d4l3k/4/orig -> origin/gh/d4l3k/4/orig
2025-12-04T12:53:07.9433863Z * [new branch] gh/d4l3k/5/base -> origin/gh/d4l3k/5/base
2025-12-04T12:53:07.9433930Z * [new branch] gh/d4l3k/5/orig -> origin/gh/d4l3k/5/orig
2025-12-04T12:53:07.9434027Z * [new branch] gh/davidberard98/392/base -> origin/gh/davidberard98/392/base
2025-12-04T12:53:07.9434116Z * [new branch] gh/davidberard98/392/head -> origin/gh/davidberard98/392/head
2025-12-04T12:53:07.9434202Z * [new branch] gh/davidberard98/392/orig -> origin/gh/davidberard98/392/orig
2025-12-04T12:53:07.9434292Z * [new branch] gh/davidberard98/399/base -> origin/gh/davidberard98/399/base
2025-12-04T12:53:07.9434378Z * [new branch] gh/davidberard98/399/head -> origin/gh/davidberard98/399/head
2025-12-04T12:53:07.9434465Z * [new branch] gh/davidberard98/399/orig -> origin/gh/davidberard98/399/orig
2025-12-04T12:53:07.9434545Z * [new branch] gh/desertfire/605/base -> origin/gh/desertfire/605/base
2025-12-04T12:53:07.9434622Z * [new branch] gh/desertfire/605/head -> origin/gh/desertfire/605/head
2025-12-04T12:53:07.9434706Z * [new branch] gh/desertfire/605/orig -> origin/gh/desertfire/605/orig
2025-12-04T12:53:07.9434782Z * [new branch] gh/desertfire/606/base -> origin/gh/desertfire/606/base
2025-12-04T12:53:07.9434859Z * [new branch] gh/desertfire/606/head -> origin/gh/desertfire/606/head
2025-12-04T12:53:07.9434940Z * [new branch] gh/desertfire/606/orig -> origin/gh/desertfire/606/orig
2025-12-04T12:53:07.9435017Z * [new branch] gh/desertfire/607/base -> origin/gh/desertfire/607/base
2025-12-04T12:53:07.9435094Z * [new branch] gh/desertfire/607/head -> origin/gh/desertfire/607/head
2025-12-04T12:53:07.9435204Z * [new branch] gh/desertfire/607/orig -> origin/gh/desertfire/607/orig
2025-12-04T12:53:07.9435280Z * [new branch] gh/desertfire/608/base -> origin/gh/desertfire/608/base
2025-12-04T12:53:07.9435357Z * [new branch] gh/desertfire/608/head -> origin/gh/desertfire/608/head
2025-12-04T12:53:07.9435438Z * [new branch] gh/desertfire/608/orig -> origin/gh/desertfire/608/orig
2025-12-04T12:53:07.9435515Z * [new branch] gh/desertfire/609/base -> origin/gh/desertfire/609/base
2025-12-04T12:53:07.9435591Z * [new branch] gh/desertfire/609/head -> origin/gh/desertfire/609/head
2025-12-04T12:53:07.9435672Z * [new branch] gh/desertfire/609/orig -> origin/gh/desertfire/609/orig
2025-12-04T12:53:07.9435748Z * [new branch] gh/desertfire/610/base -> origin/gh/desertfire/610/base
2025-12-04T12:53:07.9435821Z * [new branch] gh/desertfire/610/head -> origin/gh/desertfire/610/head
2025-12-04T12:53:07.9435902Z * [new branch] gh/desertfire/610/orig -> origin/gh/desertfire/610/orig
2025-12-04T12:53:07.9435977Z * [new branch] gh/desertfire/611/base -> origin/gh/desertfire/611/base
2025-12-04T12:53:07.9436049Z * [new branch] gh/desertfire/611/head -> origin/gh/desertfire/611/head
2025-12-04T12:53:07.9436168Z * [new branch] gh/desertfire/611/orig -> origin/gh/desertfire/611/orig
2025-12-04T12:53:07.9436243Z * [new branch] gh/desertfire/612/base -> origin/gh/desertfire/612/base
2025-12-04T12:53:07.9436326Z * [new branch] gh/desertfire/612/head -> origin/gh/desertfire/612/head
2025-12-04T12:53:07.9436405Z * [new branch] gh/desertfire/612/orig -> origin/gh/desertfire/612/orig
2025-12-04T12:53:07.9436482Z * [new branch] gh/desertfire/613/base -> origin/gh/desertfire/613/base
2025-12-04T12:53:07.9436558Z * [new branch] gh/desertfire/613/head -> origin/gh/desertfire/613/head
2025-12-04T12:53:07.9436635Z * [new branch] gh/desertfire/613/orig -> origin/gh/desertfire/613/orig
2025-12-04T12:53:07.9436708Z * [new branch] gh/desertfire/614/base -> origin/gh/desertfire/614/base
2025-12-04T12:53:07.9436784Z * [new branch] gh/desertfire/614/head -> origin/gh/desertfire/614/head
2025-12-04T12:53:07.9436859Z * [new branch] gh/desertfire/614/orig -> origin/gh/desertfire/614/orig
2025-12-04T12:53:07.9436933Z * [new branch] gh/desertfire/615/base -> origin/gh/desertfire/615/base
2025-12-04T12:53:07.9437009Z * [new branch] gh/desertfire/615/head -> origin/gh/desertfire/615/head
2025-12-04T12:53:07.9437082Z * [new branch] gh/desertfire/615/orig -> origin/gh/desertfire/615/orig
2025-12-04T12:53:07.9437155Z * [new branch] gh/desertfire/616/base -> origin/gh/desertfire/616/base
2025-12-04T12:53:07.9437230Z * [new branch] gh/desertfire/616/head -> origin/gh/desertfire/616/head
2025-12-04T12:53:07.9437304Z * [new branch] gh/desertfire/616/orig -> origin/gh/desertfire/616/orig
2025-12-04T12:53:07.9437378Z * [new branch] gh/desertfire/617/base -> origin/gh/desertfire/617/base
2025-12-04T12:53:07.9437453Z * [new branch] gh/desertfire/617/head -> origin/gh/desertfire/617/head
2025-12-04T12:53:07.9437527Z * [new branch] gh/desertfire/617/orig -> origin/gh/desertfire/617/orig
2025-12-04T12:53:07.9437598Z * [new branch] gh/dharakk/1/base -> origin/gh/dharakk/1/base
2025-12-04T12:53:07.9437671Z * [new branch] gh/dharakk/1/head -> origin/gh/dharakk/1/head
2025-12-04T12:53:07.9437745Z * [new branch] gh/drisspg/170/base -> origin/gh/drisspg/170/base
2025-12-04T12:53:07.9437824Z * [new branch] gh/drisspg/170/head -> origin/gh/drisspg/170/head
2025-12-04T12:53:07.9437894Z * [new branch] gh/drisspg/170/orig -> origin/gh/drisspg/170/orig
2025-12-04T12:53:07.9437989Z * [new branch] gh/drisspg/182/base -> origin/gh/drisspg/182/base
2025-12-04T12:53:07.9438061Z * [new branch] gh/drisspg/182/head -> origin/gh/drisspg/182/head
2025-12-04T12:53:07.9438130Z * [new branch] gh/drisspg/183/base -> origin/gh/drisspg/183/base
2025-12-04T12:53:07.9438200Z * [new branch] gh/drisspg/183/head -> origin/gh/drisspg/183/head
2025-12-04T12:53:07.9438272Z * [new branch] gh/drisspg/184/base -> origin/gh/drisspg/184/base
2025-12-04T12:53:07.9438340Z * [new branch] gh/drisspg/184/head -> origin/gh/drisspg/184/head
2025-12-04T12:53:07.9438409Z * [new branch] gh/drisspg/185/base -> origin/gh/drisspg/185/base
2025-12-04T12:53:07.9438480Z * [new branch] gh/drisspg/185/head -> origin/gh/drisspg/185/head
2025-12-04T12:53:07.9438549Z * [new branch] gh/drisspg/194/base -> origin/gh/drisspg/194/base
2025-12-04T12:53:07.9438619Z * [new branch] gh/drisspg/194/head -> origin/gh/drisspg/194/head
2025-12-04T12:53:07.9438691Z * [new branch] gh/drisspg/194/orig -> origin/gh/drisspg/194/orig
2025-12-04T12:53:07.9438760Z * [new branch] gh/drisspg/200/base -> origin/gh/drisspg/200/base
2025-12-04T12:53:07.9438854Z * [new branch] gh/drisspg/200/head -> origin/gh/drisspg/200/head
2025-12-04T12:53:07.9438926Z * [new branch] gh/drisspg/200/orig -> origin/gh/drisspg/200/orig
2025-12-04T12:53:07.9438996Z * [new branch] gh/drisspg/218/base -> origin/gh/drisspg/218/base
2025-12-04T12:53:07.9439065Z * [new branch] gh/drisspg/218/head -> origin/gh/drisspg/218/head
2025-12-04T12:53:07.9439138Z * [new branch] gh/drisspg/218/orig -> origin/gh/drisspg/218/orig
2025-12-04T12:53:07.9439209Z * [new branch] gh/drisspg/219/base -> origin/gh/drisspg/219/base
2025-12-04T12:53:07.9439279Z * [new branch] gh/drisspg/219/head -> origin/gh/drisspg/219/head
2025-12-04T12:53:07.9439351Z * [new branch] gh/drisspg/219/orig -> origin/gh/drisspg/219/orig
2025-12-04T12:53:07.9439419Z * [new branch] gh/drisspg/220/base -> origin/gh/drisspg/220/base
2025-12-04T12:53:07.9439495Z * [new branch] gh/drisspg/220/head -> origin/gh/drisspg/220/head
2025-12-04T12:53:07.9439565Z * [new branch] gh/drisspg/220/orig -> origin/gh/drisspg/220/orig
2025-12-04T12:53:07.9439634Z * [new branch] gh/drisspg/221/base -> origin/gh/drisspg/221/base
2025-12-04T12:53:07.9439707Z * [new branch] gh/drisspg/221/head -> origin/gh/drisspg/221/head
2025-12-04T12:53:07.9439776Z * [new branch] gh/drisspg/221/orig -> origin/gh/drisspg/221/orig
2025-12-04T12:53:07.9439846Z * [new branch] gh/drisspg/222/base -> origin/gh/drisspg/222/base
2025-12-04T12:53:07.9439918Z * [new branch] gh/drisspg/222/head -> origin/gh/drisspg/222/head
2025-12-04T12:53:07.9439987Z * [new branch] gh/drisspg/222/orig -> origin/gh/drisspg/222/orig
2025-12-04T12:53:07.9440055Z * [new branch] gh/drisspg/223/base -> origin/gh/drisspg/223/base
2025-12-04T12:53:07.9440128Z * [new branch] gh/drisspg/223/head -> origin/gh/drisspg/223/head
2025-12-04T12:53:07.9440233Z * [new branch] gh/drisspg/223/orig -> origin/gh/drisspg/223/orig
2025-12-04T12:53:07.9440304Z * [new branch] gh/drisspg/224/base -> origin/gh/drisspg/224/base
2025-12-04T12:53:07.9440378Z * [new branch] gh/drisspg/224/head -> origin/gh/drisspg/224/head
2025-12-04T12:53:07.9440446Z * [new branch] gh/drisspg/224/orig -> origin/gh/drisspg/224/orig
2025-12-04T12:53:07.9440558Z * [new branch] gh/drisspg/225/base -> origin/gh/drisspg/225/base
2025-12-04T12:53:07.9440629Z * [new branch] gh/drisspg/225/head -> origin/gh/drisspg/225/head
2025-12-04T12:53:07.9440701Z * [new branch] gh/drisspg/225/orig -> origin/gh/drisspg/225/orig
2025-12-04T12:53:07.9440770Z * [new branch] gh/drisspg/226/base -> origin/gh/drisspg/226/base
2025-12-04T12:53:07.9440843Z * [new branch] gh/drisspg/226/head -> origin/gh/drisspg/226/head
2025-12-04T12:53:07.9440912Z * [new branch] gh/drisspg/226/orig -> origin/gh/drisspg/226/orig
2025-12-04T12:53:07.9440981Z * [new branch] gh/drisspg/227/base -> origin/gh/drisspg/227/base
2025-12-04T12:53:07.9441052Z * [new branch] gh/drisspg/227/head -> origin/gh/drisspg/227/head
2025-12-04T12:53:07.9441120Z * [new branch] gh/drisspg/227/orig -> origin/gh/drisspg/227/orig
2025-12-04T12:53:07.9441192Z * [new branch] gh/drisspg/228/base -> origin/gh/drisspg/228/base
2025-12-04T12:53:07.9441263Z * [new branch] gh/drisspg/228/head -> origin/gh/drisspg/228/head
2025-12-04T12:53:07.9441331Z * [new branch] gh/drisspg/228/orig -> origin/gh/drisspg/228/orig
2025-12-04T12:53:07.9441402Z * [new branch] gh/drisspg/229/base -> origin/gh/drisspg/229/base
2025-12-04T12:53:07.9441512Z * [new branch] gh/drisspg/229/head -> origin/gh/drisspg/229/head
2025-12-04T12:53:07.9441581Z * [new branch] gh/drisspg/229/orig -> origin/gh/drisspg/229/orig
2025-12-04T12:53:07.9441654Z * [new branch] gh/drisspg/230/base -> origin/gh/drisspg/230/base
2025-12-04T12:53:07.9441723Z * [new branch] gh/drisspg/230/head -> origin/gh/drisspg/230/head
2025-12-04T12:53:07.9441792Z * [new branch] gh/drisspg/230/orig -> origin/gh/drisspg/230/orig
2025-12-04T12:53:07.9441869Z * [new branch] gh/dsjohns2/1/base -> origin/gh/dsjohns2/1/base
2025-12-04T12:53:07.9441940Z * [new branch] gh/dsjohns2/1/head -> origin/gh/dsjohns2/1/head
2025-12-04T12:53:07.9442018Z * [new branch] gh/dzmitry-huba/1/base -> origin/gh/dzmitry-huba/1/base
2025-12-04T12:53:07.9442101Z * [new branch] gh/dzmitry-huba/1/head -> origin/gh/dzmitry-huba/1/head
2025-12-04T12:53:07.9442181Z * [new branch] gh/dzmitry-huba/12/base -> origin/gh/dzmitry-huba/12/base
2025-12-04T12:53:07.9442260Z * [new branch] gh/dzmitry-huba/12/head -> origin/gh/dzmitry-huba/12/head
2025-12-04T12:53:07.9442339Z * [new branch] gh/dzmitry-huba/12/orig -> origin/gh/dzmitry-huba/12/orig
2025-12-04T12:53:07.9442415Z * [new branch] gh/dzmitry-huba/13/base -> origin/gh/dzmitry-huba/13/base
2025-12-04T12:53:07.9442489Z * [new branch] gh/dzmitry-huba/13/head -> origin/gh/dzmitry-huba/13/head
2025-12-04T12:53:07.9442568Z * [new branch] gh/dzmitry-huba/13/orig -> origin/gh/dzmitry-huba/13/orig
2025-12-04T12:53:07.9442643Z * [new branch] gh/dzmitry-huba/14/base -> origin/gh/dzmitry-huba/14/base
2025-12-04T12:53:07.9442717Z * [new branch] gh/dzmitry-huba/14/head -> origin/gh/dzmitry-huba/14/head
2025-12-04T12:53:07.9442797Z * [new branch] gh/dzmitry-huba/14/orig -> origin/gh/dzmitry-huba/14/orig
2025-12-04T12:53:07.9442871Z * [new branch] gh/dzmitry-huba/15/base -> origin/gh/dzmitry-huba/15/base
2025-12-04T12:53:07.9442947Z * [new branch] gh/dzmitry-huba/15/head -> origin/gh/dzmitry-huba/15/head
2025-12-04T12:53:07.9443021Z * [new branch] gh/dzmitry-huba/15/orig -> origin/gh/dzmitry-huba/15/orig
2025-12-04T12:53:07.9443095Z * [new branch] gh/dzmitry-huba/16/base -> origin/gh/dzmitry-huba/16/base
2025-12-04T12:53:07.9443172Z * [new branch] gh/dzmitry-huba/16/head -> origin/gh/dzmitry-huba/16/head
2025-12-04T12:53:07.9443275Z * [new branch] gh/dzmitry-huba/16/orig -> origin/gh/dzmitry-huba/16/orig
2025-12-04T12:53:07.9443349Z * [new branch] gh/dzmitry-huba/17/base -> origin/gh/dzmitry-huba/17/base
2025-12-04T12:53:07.9443426Z * [new branch] gh/dzmitry-huba/17/head -> origin/gh/dzmitry-huba/17/head
2025-12-04T12:53:07.9443502Z * [new branch] gh/dzmitry-huba/17/orig -> origin/gh/dzmitry-huba/17/orig
2025-12-04T12:53:07.9443578Z * [new branch] gh/dzmitry-huba/2/base -> origin/gh/dzmitry-huba/2/base
2025-12-04T12:53:07.9443656Z * [new branch] gh/dzmitry-huba/2/head -> origin/gh/dzmitry-huba/2/head
2025-12-04T12:53:07.9443731Z * [new branch] gh/dzmitry-huba/3/base -> origin/gh/dzmitry-huba/3/base
2025-12-04T12:53:07.9443806Z * [new branch] gh/dzmitry-huba/3/head -> origin/gh/dzmitry-huba/3/head
2025-12-04T12:53:07.9443883Z * [new branch] gh/eellison/808/base -> origin/gh/eellison/808/base
2025-12-04T12:53:07.9443959Z * [new branch] gh/eellison/808/head -> origin/gh/eellison/808/head
2025-12-04T12:53:07.9444032Z * [new branch] gh/eellison/808/orig -> origin/gh/eellison/808/orig
2025-12-04T12:53:07.9444106Z * [new branch] gh/eellison/822/base -> origin/gh/eellison/822/base
2025-12-04T12:53:07.9444202Z * [new branch] gh/eellison/822/head -> origin/gh/eellison/822/head
2025-12-04T12:53:07.9444274Z * [new branch] gh/eellison/822/orig -> origin/gh/eellison/822/orig
2025-12-04T12:53:07.9444347Z * [new branch] gh/eellison/823/base -> origin/gh/eellison/823/base
2025-12-04T12:53:07.9444418Z * [new branch] gh/eellison/823/head -> origin/gh/eellison/823/head
2025-12-04T12:53:07.9444488Z * [new branch] gh/eellison/823/orig -> origin/gh/eellison/823/orig
2025-12-04T12:53:07.9444563Z * [new branch] gh/eellison/862/base -> origin/gh/eellison/862/base
2025-12-04T12:53:07.9444636Z * [new branch] gh/eellison/862/head -> origin/gh/eellison/862/head
2025-12-04T12:53:07.9444708Z * [new branch] gh/eellison/862/orig -> origin/gh/eellison/862/orig
2025-12-04T12:53:07.9444781Z * [new branch] gh/eellison/863/base -> origin/gh/eellison/863/base
2025-12-04T12:53:07.9444852Z * [new branch] gh/eellison/863/head -> origin/gh/eellison/863/head
2025-12-04T12:53:07.9444925Z * [new branch] gh/eellison/863/orig -> origin/gh/eellison/863/orig
2025-12-04T12:53:07.9444997Z * [new branch] gh/eellison/864/base -> origin/gh/eellison/864/base
2025-12-04T12:53:07.9445068Z * [new branch] gh/eellison/864/head -> origin/gh/eellison/864/head
2025-12-04T12:53:07.9445144Z * [new branch] gh/eellison/864/orig -> origin/gh/eellison/864/orig
2025-12-04T12:53:07.9445215Z * [new branch] gh/eellison/865/base -> origin/gh/eellison/865/base
2025-12-04T12:53:07.9445286Z * [new branch] gh/eellison/865/head -> origin/gh/eellison/865/head
2025-12-04T12:53:07.9445358Z * [new branch] gh/eellison/865/orig -> origin/gh/eellison/865/orig
2025-12-04T12:53:07.9445429Z * [new branch] gh/eellison/866/base -> origin/gh/eellison/866/base
2025-12-04T12:53:07.9445501Z * [new branch] gh/eellison/866/head -> origin/gh/eellison/866/head
2025-12-04T12:53:07.9445575Z * [new branch] gh/eellison/866/orig -> origin/gh/eellison/866/orig
2025-12-04T12:53:07.9445646Z * [new branch] gh/eellison/867/base -> origin/gh/eellison/867/base
2025-12-04T12:53:07.9445716Z * [new branch] gh/eellison/867/head -> origin/gh/eellison/867/head
2025-12-04T12:53:07.9445790Z * [new branch] gh/eellison/867/orig -> origin/gh/eellison/867/orig
2025-12-04T12:53:07.9445898Z * [new branch] gh/eellison/868/base -> origin/gh/eellison/868/base
2025-12-04T12:53:07.9445969Z * [new branch] gh/eellison/868/head -> origin/gh/eellison/868/head
2025-12-04T12:53:07.9446042Z * [new branch] gh/eellison/868/orig -> origin/gh/eellison/868/orig
2025-12-04T12:53:07.9446112Z * [new branch] gh/eellison/869/base -> origin/gh/eellison/869/base
2025-12-04T12:53:07.9446186Z * [new branch] gh/eellison/869/head -> origin/gh/eellison/869/head
2025-12-04T12:53:07.9446257Z * [new branch] gh/eellison/869/orig -> origin/gh/eellison/869/orig
2025-12-04T12:53:07.9446328Z * [new branch] gh/eellison/870/base -> origin/gh/eellison/870/base
2025-12-04T12:53:07.9446403Z * [new branch] gh/eellison/870/head -> origin/gh/eellison/870/head
2025-12-04T12:53:07.9446477Z * [new branch] gh/eellison/870/orig -> origin/gh/eellison/870/orig
2025-12-04T12:53:07.9446550Z * [new branch] gh/eellison/871/base -> origin/gh/eellison/871/base
2025-12-04T12:53:07.9446623Z * [new branch] gh/eellison/871/head -> origin/gh/eellison/871/head
2025-12-04T12:53:07.9446696Z * [new branch] gh/eellison/871/orig -> origin/gh/eellison/871/orig
2025-12-04T12:53:07.9446766Z * [new branch] gh/eellison/872/base -> origin/gh/eellison/872/base
2025-12-04T12:53:07.9446868Z * [new branch] gh/eellison/872/head -> origin/gh/eellison/872/head
2025-12-04T12:53:07.9446939Z * [new branch] gh/eellison/872/orig -> origin/gh/eellison/872/orig
2025-12-04T12:53:07.9447011Z * [new branch] gh/eellison/873/base -> origin/gh/eellison/873/base
2025-12-04T12:53:07.9447083Z * [new branch] gh/eellison/873/head -> origin/gh/eellison/873/head
2025-12-04T12:53:07.9447154Z * [new branch] gh/eellison/873/orig -> origin/gh/eellison/873/orig
2025-12-04T12:53:07.9447227Z * [new branch] gh/eellison/874/base -> origin/gh/eellison/874/base
2025-12-04T12:53:07.9447300Z * [new branch] gh/eellison/874/head -> origin/gh/eellison/874/head
2025-12-04T12:53:07.9447370Z * [new branch] gh/eellison/874/orig -> origin/gh/eellison/874/orig
2025-12-04T12:53:07.9447443Z * [new branch] gh/eellison/875/base -> origin/gh/eellison/875/base
2025-12-04T12:53:07.9447519Z * [new branch] gh/eellison/875/head -> origin/gh/eellison/875/head
2025-12-04T12:53:07.9447589Z * [new branch] gh/eellison/875/orig -> origin/gh/eellison/875/orig
2025-12-04T12:53:07.9447662Z * [new branch] gh/eellison/876/base -> origin/gh/eellison/876/base
2025-12-04T12:53:07.9447733Z * [new branch] gh/eellison/876/head -> origin/gh/eellison/876/head
2025-12-04T12:53:07.9447804Z * [new branch] gh/eellison/876/orig -> origin/gh/eellison/876/orig
2025-12-04T12:53:07.9447878Z * [new branch] gh/eellison/877/base -> origin/gh/eellison/877/base
2025-12-04T12:53:07.9447950Z * [new branch] gh/eellison/877/head -> origin/gh/eellison/877/head
2025-12-04T12:53:07.9448020Z * [new branch] gh/eellison/877/orig -> origin/gh/eellison/877/orig
2025-12-04T12:53:07.9448094Z * [new branch] gh/eellison/878/base -> origin/gh/eellison/878/base
2025-12-04T12:53:07.9448165Z * [new branch] gh/eellison/878/head -> origin/gh/eellison/878/head
2025-12-04T12:53:07.9448235Z * [new branch] gh/eellison/878/orig -> origin/gh/eellison/878/orig
2025-12-04T12:53:07.9448309Z * [new branch] gh/eellison/879/base -> origin/gh/eellison/879/base
2025-12-04T12:53:07.9448379Z * [new branch] gh/eellison/879/head -> origin/gh/eellison/879/head
2025-12-04T12:53:07.9448450Z * [new branch] gh/eellison/879/orig -> origin/gh/eellison/879/orig
2025-12-04T12:53:07.9448557Z * [new branch] gh/eellison/880/base -> origin/gh/eellison/880/base
2025-12-04T12:53:07.9448627Z * [new branch] gh/eellison/880/head -> origin/gh/eellison/880/head
2025-12-04T12:53:07.9448698Z * [new branch] gh/eellison/880/orig -> origin/gh/eellison/880/orig
2025-12-04T12:53:07.9448772Z * [new branch] gh/eellison/881/base -> origin/gh/eellison/881/base
2025-12-04T12:53:07.9448842Z * [new branch] gh/eellison/881/head -> origin/gh/eellison/881/head
2025-12-04T12:53:07.9448913Z * [new branch] gh/eellison/881/orig -> origin/gh/eellison/881/orig
2025-12-04T12:53:07.9448985Z * [new branch] gh/eellison/882/base -> origin/gh/eellison/882/base
2025-12-04T12:53:07.9449055Z * [new branch] gh/eellison/882/head -> origin/gh/eellison/882/head
2025-12-04T12:53:07.9449125Z * [new branch] gh/eellison/882/orig -> origin/gh/eellison/882/orig
2025-12-04T12:53:07.9449199Z * [new branch] gh/eellison/883/base -> origin/gh/eellison/883/base
2025-12-04T12:53:07.9449269Z * [new branch] gh/eellison/883/head -> origin/gh/eellison/883/head
2025-12-04T12:53:07.9449343Z * [new branch] gh/eellison/883/orig -> origin/gh/eellison/883/orig
2025-12-04T12:53:07.9449440Z * [new branch] gh/eellison/884/base -> origin/gh/eellison/884/base
2025-12-04T12:53:07.9449513Z * [new branch] gh/eellison/884/head -> origin/gh/eellison/884/head
2025-12-04T12:53:07.9449587Z * [new branch] gh/eellison/884/orig -> origin/gh/eellison/884/orig
2025-12-04T12:53:07.9449656Z * [new branch] gh/etaf/147/base -> origin/gh/etaf/147/base
2025-12-04T12:53:07.9449723Z * [new branch] gh/etaf/147/head -> origin/gh/etaf/147/head
2025-12-04T12:53:07.9449792Z * [new branch] gh/etaf/154/base -> origin/gh/etaf/154/base
2025-12-04T12:53:07.9449859Z * [new branch] gh/etaf/154/head -> origin/gh/etaf/154/head
2025-12-04T12:53:07.9449923Z * [new branch] gh/etaf/154/orig -> origin/gh/etaf/154/orig
2025-12-04T12:53:07.9449990Z * [new branch] gh/etaf/156/base -> origin/gh/etaf/156/base
2025-12-04T12:53:07.9450057Z * [new branch] gh/etaf/156/head -> origin/gh/etaf/156/head
2025-12-04T12:53:07.9450122Z * [new branch] gh/etaf/156/orig -> origin/gh/etaf/156/orig
2025-12-04T12:53:07.9450228Z * [new branch] gh/etaf/157/base -> origin/gh/etaf/157/base
2025-12-04T12:53:07.9450295Z * [new branch] gh/etaf/157/head -> origin/gh/etaf/157/head
2025-12-04T12:53:07.9450359Z * [new branch] gh/etaf/157/orig -> origin/gh/etaf/157/orig
2025-12-04T12:53:07.9450426Z * [new branch] gh/etaf/158/base -> origin/gh/etaf/158/base
2025-12-04T12:53:07.9450491Z * [new branch] gh/etaf/158/head -> origin/gh/etaf/158/head
2025-12-04T12:53:07.9450556Z * [new branch] gh/etaf/158/orig -> origin/gh/etaf/158/orig
2025-12-04T12:53:07.9450624Z * [new branch] gh/etaf/159/base -> origin/gh/etaf/159/base
2025-12-04T12:53:07.9450688Z * [new branch] gh/etaf/159/head -> origin/gh/etaf/159/head
2025-12-04T12:53:07.9450756Z * [new branch] gh/etaf/159/orig -> origin/gh/etaf/159/orig
2025-12-04T12:53:07.9450827Z * [new branch] gh/etaf/160/base -> origin/gh/etaf/160/base
2025-12-04T12:53:07.9450893Z * [new branch] gh/etaf/160/head -> origin/gh/etaf/160/head
2025-12-04T12:53:07.9450962Z * [new branch] gh/etaf/160/orig -> origin/gh/etaf/160/orig
2025-12-04T12:53:07.9451027Z * [new branch] gh/etaf/161/base -> origin/gh/etaf/161/base
2025-12-04T12:53:07.9451139Z * [new branch] gh/etaf/161/head -> origin/gh/etaf/161/head
2025-12-04T12:53:07.9451205Z * [new branch] gh/etaf/161/orig -> origin/gh/etaf/161/orig
2025-12-04T12:53:07.9451270Z * [new branch] gh/etaf/166/base -> origin/gh/etaf/166/base
2025-12-04T12:53:07.9451335Z * [new branch] gh/etaf/166/head -> origin/gh/etaf/166/head
2025-12-04T12:53:07.9451403Z * [new branch] gh/etaf/166/orig -> origin/gh/etaf/166/orig
2025-12-04T12:53:07.9451469Z * [new branch] gh/etaf/167/base -> origin/gh/etaf/167/base
2025-12-04T12:53:07.9451535Z * [new branch] gh/etaf/167/head -> origin/gh/etaf/167/head
2025-12-04T12:53:07.9451607Z * [new branch] gh/etaf/167/orig -> origin/gh/etaf/167/orig
2025-12-04T12:53:07.9451672Z * [new branch] gh/etaf/168/base -> origin/gh/etaf/168/base
2025-12-04T12:53:07.9451736Z * [new branch] gh/etaf/168/head -> origin/gh/etaf/168/head
2025-12-04T12:53:07.9451806Z * [new branch] gh/etaf/168/orig -> origin/gh/etaf/168/orig
2025-12-04T12:53:07.9451870Z * [new branch] gh/etaf/172/base -> origin/gh/etaf/172/base
2025-12-04T12:53:07.9451934Z * [new branch] gh/etaf/172/head -> origin/gh/etaf/172/head
2025-12-04T12:53:07.9452041Z * [new branch] gh/etaf/172/orig -> origin/gh/etaf/172/orig
2025-12-04T12:53:07.9452108Z * [new branch] gh/etaf/173/base -> origin/gh/etaf/173/base
2025-12-04T12:53:07.9452173Z * [new branch] gh/etaf/173/head -> origin/gh/etaf/173/head
2025-12-04T12:53:07.9452242Z * [new branch] gh/etaf/173/orig -> origin/gh/etaf/173/orig
2025-12-04T12:53:07.9452308Z * [new branch] gh/etaf/174/base -> origin/gh/etaf/174/base
2025-12-04T12:53:07.9452372Z * [new branch] gh/etaf/174/head -> origin/gh/etaf/174/head
2025-12-04T12:53:07.9452441Z * [new branch] gh/etaf/175/base -> origin/gh/etaf/175/base
2025-12-04T12:53:07.9452506Z * [new branch] gh/etaf/175/head -> origin/gh/etaf/175/head
2025-12-04T12:53:07.9452574Z * [new branch] gh/etaf/175/orig -> origin/gh/etaf/175/orig
2025-12-04T12:53:07.9452640Z * [new branch] gh/etaf/176/base -> origin/gh/etaf/176/base
2025-12-04T12:53:07.9452704Z * [new branch] gh/etaf/176/head -> origin/gh/etaf/176/head
2025-12-04T12:53:07.9452770Z * [new branch] gh/etaf/176/orig -> origin/gh/etaf/176/orig
2025-12-04T12:53:07.9452834Z * [new branch] gh/etaf/177/base -> origin/gh/etaf/177/base
2025-12-04T12:53:07.9452900Z * [new branch] gh/etaf/177/head -> origin/gh/etaf/177/head
2025-12-04T12:53:07.9452967Z * [new branch] gh/etaf/177/orig -> origin/gh/etaf/177/orig
2025-12-04T12:53:07.9453033Z * [new branch] gh/etaf/178/base -> origin/gh/etaf/178/base
2025-12-04T12:53:07.9453097Z * [new branch] gh/etaf/178/head -> origin/gh/etaf/178/head
2025-12-04T12:53:07.9453165Z * [new branch] gh/etaf/178/orig -> origin/gh/etaf/178/orig
2025-12-04T12:53:07.9453232Z * [new branch] gh/etaf/179/base -> origin/gh/etaf/179/base
2025-12-04T12:53:07.9453298Z * [new branch] gh/etaf/179/head -> origin/gh/etaf/179/head
2025-12-04T12:53:07.9453365Z * [new branch] gh/etaf/179/orig -> origin/gh/etaf/179/orig
2025-12-04T12:53:07.9453429Z * [new branch] gh/etaf/180/base -> origin/gh/etaf/180/base
2025-12-04T12:53:07.9453493Z * [new branch] gh/etaf/180/head -> origin/gh/etaf/180/head
2025-12-04T12:53:07.9453561Z * [new branch] gh/etaf/180/orig -> origin/gh/etaf/180/orig
2025-12-04T12:53:07.9453670Z * [new branch] gh/exclamaforte/1/base -> origin/gh/exclamaforte/1/base
2025-12-04T12:53:07.9453751Z * [new branch] gh/exclamaforte/1/head -> origin/gh/exclamaforte/1/head
2025-12-04T12:53:07.9453835Z * [new branch] gh/exclamaforte/2/base -> origin/gh/exclamaforte/2/base
2025-12-04T12:53:07.9453914Z * [new branch] gh/exclamaforte/2/head -> origin/gh/exclamaforte/2/head
2025-12-04T12:53:07.9453989Z * [new branch] gh/exclamaforte/3/base -> origin/gh/exclamaforte/3/base
2025-12-04T12:53:07.9454067Z * [new branch] gh/exclamaforte/3/head -> origin/gh/exclamaforte/3/head
2025-12-04T12:53:07.9454143Z * [new branch] gh/exclamaforte/4/base -> origin/gh/exclamaforte/4/base
2025-12-04T12:53:07.9454223Z * [new branch] gh/exclamaforte/4/head -> origin/gh/exclamaforte/4/head
2025-12-04T12:53:07.9454296Z * [new branch] gh/ezyang/2374/base -> origin/gh/ezyang/2374/base
2025-12-04T12:53:07.9454368Z * [new branch] gh/ezyang/2374/head -> origin/gh/ezyang/2374/head
2025-12-04T12:53:07.9454441Z * [new branch] gh/ezyang/2374/orig -> origin/gh/ezyang/2374/orig
2025-12-04T12:53:07.9454512Z * [new branch] gh/ezyang/2973/base -> origin/gh/ezyang/2973/base
2025-12-04T12:53:07.9454607Z * [new branch] gh/ezyang/2973/head -> origin/gh/ezyang/2973/head
2025-12-04T12:53:07.9454683Z * [new branch] gh/ezyang/2973/orig -> origin/gh/ezyang/2973/orig
2025-12-04T12:53:07.9454754Z * [new branch] gh/ezyang/2974/base -> origin/gh/ezyang/2974/base
2025-12-04T12:53:07.9454822Z * [new branch] gh/ezyang/2974/head -> origin/gh/ezyang/2974/head
2025-12-04T12:53:07.9454893Z * [new branch] gh/ezyang/2974/orig -> origin/gh/ezyang/2974/orig
2025-12-04T12:53:07.9454963Z * [new branch] gh/ezyang/3131/base -> origin/gh/ezyang/3131/base
2025-12-04T12:53:07.9455036Z * [new branch] gh/ezyang/3131/head -> origin/gh/ezyang/3131/head
2025-12-04T12:53:07.9455108Z * [new branch] gh/ezyang/3131/orig -> origin/gh/ezyang/3131/orig
2025-12-04T12:53:07.9455181Z * [new branch] gh/ezyang/3139/base -> origin/gh/ezyang/3139/base
2025-12-04T12:53:07.9455253Z * [new branch] gh/ezyang/3139/head -> origin/gh/ezyang/3139/head
2025-12-04T12:53:07.9455325Z * [new branch] gh/ezyang/3139/orig -> origin/gh/ezyang/3139/orig
2025-12-04T12:53:07.9455393Z * [new branch] gh/ezyang/3140/base -> origin/gh/ezyang/3140/base
2025-12-04T12:53:07.9455462Z * [new branch] gh/ezyang/3140/head -> origin/gh/ezyang/3140/head
2025-12-04T12:53:07.9455532Z * [new branch] gh/ezyang/3140/orig -> origin/gh/ezyang/3140/orig
2025-12-04T12:53:07.9455601Z * [new branch] gh/ezyang/3143/base -> origin/gh/ezyang/3143/base
2025-12-04T12:53:07.9455672Z * [new branch] gh/ezyang/3143/head -> origin/gh/ezyang/3143/head
2025-12-04T12:53:07.9455747Z * [new branch] gh/ezyang/3143/orig -> origin/gh/ezyang/3143/orig
2025-12-04T12:53:07.9455816Z * [new branch] gh/ezyang/3144/base -> origin/gh/ezyang/3144/base
2025-12-04T12:53:07.9455888Z * [new branch] gh/ezyang/3144/head -> origin/gh/ezyang/3144/head
2025-12-04T12:53:07.9455957Z * [new branch] gh/ezyang/3144/orig -> origin/gh/ezyang/3144/orig
2025-12-04T12:53:07.9456026Z * [new branch] gh/ezyang/3167/base -> origin/gh/ezyang/3167/base
2025-12-04T12:53:07.9456102Z * [new branch] gh/ezyang/3167/head -> origin/gh/ezyang/3167/head
2025-12-04T12:53:07.9456171Z * [new branch] gh/ezyang/3167/orig -> origin/gh/ezyang/3167/orig
2025-12-04T12:53:07.9456240Z * [new branch] gh/ezyang/3173/base -> origin/gh/ezyang/3173/base
2025-12-04T12:53:07.9456352Z * [new branch] gh/ezyang/3173/head -> origin/gh/ezyang/3173/head
2025-12-04T12:53:07.9456422Z * [new branch] gh/ezyang/3173/orig -> origin/gh/ezyang/3173/orig
2025-12-04T12:53:07.9456489Z * [new branch] gh/ezyang/3175/base -> origin/gh/ezyang/3175/base
2025-12-04T12:53:07.9456562Z * [new branch] gh/ezyang/3175/head -> origin/gh/ezyang/3175/head
2025-12-04T12:53:07.9456632Z * [new branch] gh/ezyang/3175/orig -> origin/gh/ezyang/3175/orig
2025-12-04T12:53:07.9456702Z * [new branch] gh/ezyang/3182/base -> origin/gh/ezyang/3182/base
2025-12-04T12:53:07.9456775Z * [new branch] gh/ezyang/3182/head -> origin/gh/ezyang/3182/head
2025-12-04T12:53:07.9456844Z * [new branch] gh/ezyang/3182/orig -> origin/gh/ezyang/3182/orig
2025-12-04T12:53:07.9456912Z * [new branch] gh/ezyang/3185/base -> origin/gh/ezyang/3185/base
2025-12-04T12:53:07.9456985Z * [new branch] gh/ezyang/3185/head -> origin/gh/ezyang/3185/head
2025-12-04T12:53:07.9457054Z * [new branch] gh/ezyang/3185/orig -> origin/gh/ezyang/3185/orig
2025-12-04T12:53:07.9457125Z * [new branch] gh/ezyang/3189/base -> origin/gh/ezyang/3189/base
2025-12-04T12:53:07.9457236Z * [new branch] gh/ezyang/3189/head -> origin/gh/ezyang/3189/head
2025-12-04T12:53:07.9457306Z * [new branch] gh/ezyang/3189/orig -> origin/gh/ezyang/3189/orig
2025-12-04T12:53:07.9457374Z * [new branch] gh/ezyang/3191/base -> origin/gh/ezyang/3191/base
2025-12-04T12:53:07.9457445Z * [new branch] gh/ezyang/3191/head -> origin/gh/ezyang/3191/head
2025-12-04T12:53:07.9457512Z * [new branch] gh/ezyang/3191/orig -> origin/gh/ezyang/3191/orig
2025-12-04T12:53:07.9457583Z * [new branch] gh/ezyang/3192/base -> origin/gh/ezyang/3192/base
2025-12-04T12:53:07.9457654Z * [new branch] gh/ezyang/3192/head -> origin/gh/ezyang/3192/head
2025-12-04T12:53:07.9475067Z * [new branch] gh/ezyang/3192/orig -> origin/gh/ezyang/3192/orig
2025-12-04T12:53:07.9475162Z * [new branch] gh/ezyang/3193/base -> origin/gh/ezyang/3193/base
2025-12-04T12:53:07.9475240Z * [new branch] gh/ezyang/3193/head -> origin/gh/ezyang/3193/head
2025-12-04T12:53:07.9475317Z * [new branch] gh/ezyang/3193/orig -> origin/gh/ezyang/3193/orig
2025-12-04T12:53:07.9475388Z * [new branch] gh/ezyang/3194/base -> origin/gh/ezyang/3194/base
2025-12-04T12:53:07.9475460Z * [new branch] gh/ezyang/3194/head -> origin/gh/ezyang/3194/head
2025-12-04T12:53:07.9475539Z * [new branch] gh/ezyang/3194/orig -> origin/gh/ezyang/3194/orig
2025-12-04T12:53:07.9475612Z * [new branch] gh/ezyang/3195/base -> origin/gh/ezyang/3195/base
2025-12-04T12:53:07.9475685Z * [new branch] gh/ezyang/3195/head -> origin/gh/ezyang/3195/head
2025-12-04T12:53:07.9475759Z * [new branch] gh/ezyang/3195/orig -> origin/gh/ezyang/3195/orig
2025-12-04T12:53:07.9475829Z * [new branch] gh/ezyang/3196/base -> origin/gh/ezyang/3196/base
2025-12-04T12:53:07.9475899Z * [new branch] gh/ezyang/3196/head -> origin/gh/ezyang/3196/head
2025-12-04T12:53:07.9475972Z * [new branch] gh/ezyang/3196/orig -> origin/gh/ezyang/3196/orig
2025-12-04T12:53:07.9476042Z * [new branch] gh/ezyang/3197/base -> origin/gh/ezyang/3197/base
2025-12-04T12:53:07.9476113Z * [new branch] gh/ezyang/3197/head -> origin/gh/ezyang/3197/head
2025-12-04T12:53:07.9476183Z * [new branch] gh/ezyang/3197/orig -> origin/gh/ezyang/3197/orig
2025-12-04T12:53:07.9476253Z * [new branch] gh/ezyang/3198/base -> origin/gh/ezyang/3198/base
2025-12-04T12:53:07.9476408Z * [new branch] gh/ezyang/3198/head -> origin/gh/ezyang/3198/head
2025-12-04T12:53:07.9476477Z * [new branch] gh/ezyang/3198/orig -> origin/gh/ezyang/3198/orig
2025-12-04T12:53:07.9476546Z * [new branch] gh/ezyang/3199/base -> origin/gh/ezyang/3199/base
2025-12-04T12:53:07.9476618Z * [new branch] gh/ezyang/3199/head -> origin/gh/ezyang/3199/head
2025-12-04T12:53:07.9476686Z * [new branch] gh/ezyang/3199/orig -> origin/gh/ezyang/3199/orig
2025-12-04T12:53:07.9476755Z * [new branch] gh/ezyang/3200/base -> origin/gh/ezyang/3200/base
2025-12-04T12:53:07.9476827Z * [new branch] gh/ezyang/3200/head -> origin/gh/ezyang/3200/head
2025-12-04T12:53:07.9476896Z * [new branch] gh/ezyang/3200/orig -> origin/gh/ezyang/3200/orig
2025-12-04T12:53:07.9476966Z * [new branch] gh/ezyang/3201/base -> origin/gh/ezyang/3201/base
2025-12-04T12:53:07.9477040Z * [new branch] gh/ezyang/3201/head -> origin/gh/ezyang/3201/head
2025-12-04T12:53:07.9477109Z * [new branch] gh/ezyang/3201/orig -> origin/gh/ezyang/3201/orig
2025-12-04T12:53:07.9477179Z * [new branch] gh/ezyang/3202/base -> origin/gh/ezyang/3202/base
2025-12-04T12:53:07.9477310Z * [new branch] gh/ezyang/3202/head -> origin/gh/ezyang/3202/head
2025-12-04T12:53:07.9477380Z * [new branch] gh/ezyang/3202/orig -> origin/gh/ezyang/3202/orig
2025-12-04T12:53:07.9477448Z * [new branch] gh/ezyang/3203/base -> origin/gh/ezyang/3203/base
2025-12-04T12:53:07.9477518Z * [new branch] gh/ezyang/3203/head -> origin/gh/ezyang/3203/head
2025-12-04T12:53:07.9477588Z * [new branch] gh/ezyang/3203/orig -> origin/gh/ezyang/3203/orig
2025-12-04T12:53:07.9477657Z * [new branch] gh/ezyang/3204/base -> origin/gh/ezyang/3204/base
2025-12-04T12:53:07.9477732Z * [new branch] gh/ezyang/3204/head -> origin/gh/ezyang/3204/head
2025-12-04T12:53:07.9477801Z * [new branch] gh/ezyang/3204/orig -> origin/gh/ezyang/3204/orig
2025-12-04T12:53:07.9477874Z * [new branch] gh/ezyang/3205/base -> origin/gh/ezyang/3205/base
2025-12-04T12:53:07.9477943Z * [new branch] gh/ezyang/3205/head -> origin/gh/ezyang/3205/head
2025-12-04T12:53:07.9478011Z * [new branch] gh/ezyang/3205/orig -> origin/gh/ezyang/3205/orig
2025-12-04T12:53:07.9478084Z * [new branch] gh/ezyang/3206/base -> origin/gh/ezyang/3206/base
2025-12-04T12:53:07.9478154Z * [new branch] gh/ezyang/3206/head -> origin/gh/ezyang/3206/head
2025-12-04T12:53:07.9478223Z * [new branch] gh/ezyang/3206/orig -> origin/gh/ezyang/3206/orig
2025-12-04T12:53:07.9478294Z * [new branch] gh/ezyang/3207/base -> origin/gh/ezyang/3207/base
2025-12-04T12:53:07.9478364Z * [new branch] gh/ezyang/3207/head -> origin/gh/ezyang/3207/head
2025-12-04T12:53:07.9478431Z * [new branch] gh/ezyang/3207/orig -> origin/gh/ezyang/3207/orig
2025-12-04T12:53:07.9478505Z * [new branch] gh/ezyang/3208/base -> origin/gh/ezyang/3208/base
2025-12-04T12:53:07.9478576Z * [new branch] gh/ezyang/3208/head -> origin/gh/ezyang/3208/head
2025-12-04T12:53:07.9478644Z * [new branch] gh/ezyang/3208/orig -> origin/gh/ezyang/3208/orig
2025-12-04T12:53:07.9478717Z * [new branch] gh/ezyang/3209/base -> origin/gh/ezyang/3209/base
2025-12-04T12:53:07.9478791Z * [new branch] gh/ezyang/3209/head -> origin/gh/ezyang/3209/head
2025-12-04T12:53:07.9478863Z * [new branch] gh/ezyang/3209/orig -> origin/gh/ezyang/3209/orig
2025-12-04T12:53:07.9478941Z * [new branch] gh/fadara01/3/base -> origin/gh/fadara01/3/base
2025-12-04T12:53:07.9479044Z * [new branch] gh/fadara01/3/head -> origin/gh/fadara01/3/head
2025-12-04T12:53:07.9479118Z * [new branch] gh/fadara01/3/orig -> origin/gh/fadara01/3/orig
2025-12-04T12:53:07.9479190Z * [new branch] gh/fadara01/5/base -> origin/gh/fadara01/5/base
2025-12-04T12:53:07.9479261Z * [new branch] gh/fadara01/5/head -> origin/gh/fadara01/5/head
2025-12-04T12:53:07.9479333Z * [new branch] gh/fadara01/5/orig -> origin/gh/fadara01/5/orig
2025-12-04T12:53:07.9479407Z * [new branch] gh/fadara01/6/base -> origin/gh/fadara01/6/base
2025-12-04T12:53:07.9479477Z * [new branch] gh/fadara01/6/head -> origin/gh/fadara01/6/head
2025-12-04T12:53:07.9479549Z * [new branch] gh/fadara01/6/orig -> origin/gh/fadara01/6/orig
2025-12-04T12:53:07.9479619Z * [new branch] gh/fadara01/7/base -> origin/gh/fadara01/7/base
2025-12-04T12:53:07.9479690Z * [new branch] gh/fadara01/7/head -> origin/gh/fadara01/7/head
2025-12-04T12:53:07.9479762Z * [new branch] gh/fadara01/7/orig -> origin/gh/fadara01/7/orig
2025-12-04T12:53:07.9479831Z * [new branch] gh/fadara01/8/base -> origin/gh/fadara01/8/base
2025-12-04T12:53:07.9479931Z * [new branch] gh/fadara01/8/head -> origin/gh/fadara01/8/head
2025-12-04T12:53:07.9480001Z * [new branch] gh/fadara01/8/orig -> origin/gh/fadara01/8/orig
2025-12-04T12:53:07.9480068Z * [new branch] gh/fadara01/9/base -> origin/gh/fadara01/9/base
2025-12-04T12:53:07.9480135Z * [new branch] gh/fadara01/9/head -> origin/gh/fadara01/9/head
2025-12-04T12:53:07.9480240Z * [new branch] gh/fadara01/9/orig -> origin/gh/fadara01/9/orig
2025-12-04T12:53:07.9480315Z * [new branch] gh/fduwjj/182/base -> origin/gh/fduwjj/182/base
2025-12-04T12:53:07.9480387Z * [new branch] gh/fduwjj/182/head -> origin/gh/fduwjj/182/head
2025-12-04T12:53:07.9480459Z * [new branch] gh/fduwjj/182/orig -> origin/gh/fduwjj/182/orig
2025-12-04T12:53:07.9480530Z * [new branch] gh/fduwjj/211/base -> origin/gh/fduwjj/211/base
2025-12-04T12:53:07.9480602Z * [new branch] gh/fduwjj/211/head -> origin/gh/fduwjj/211/head
2025-12-04T12:53:07.9480677Z * [new branch] gh/fduwjj/211/orig -> origin/gh/fduwjj/211/orig
2025-12-04T12:53:07.9480747Z * [new branch] gh/fduwjj/212/base -> origin/gh/fduwjj/212/base
2025-12-04T12:53:07.9480817Z * [new branch] gh/fduwjj/212/head -> origin/gh/fduwjj/212/head
2025-12-04T12:53:07.9480892Z * [new branch] gh/fduwjj/212/orig -> origin/gh/fduwjj/212/orig
2025-12-04T12:53:07.9480963Z * [new branch] gh/fduwjj/213/base -> origin/gh/fduwjj/213/base
2025-12-04T12:53:07.9481034Z * [new branch] gh/fduwjj/213/head -> origin/gh/fduwjj/213/head
2025-12-04T12:53:07.9481109Z * [new branch] gh/fduwjj/213/orig -> origin/gh/fduwjj/213/orig
2025-12-04T12:53:07.9481179Z * [new branch] gh/fduwjj/226/base -> origin/gh/fduwjj/226/base
2025-12-04T12:53:07.9481248Z * [new branch] gh/fduwjj/226/head -> origin/gh/fduwjj/226/head
2025-12-04T12:53:07.9481318Z * [new branch] gh/fduwjj/226/orig -> origin/gh/fduwjj/226/orig
2025-12-04T12:53:07.9481385Z * [new branch] gh/fduwjj/229/base -> origin/gh/fduwjj/229/base
2025-12-04T12:53:07.9481452Z * [new branch] gh/fduwjj/229/head -> origin/gh/fduwjj/229/head
2025-12-04T12:53:07.9481521Z * [new branch] gh/fduwjj/229/orig -> origin/gh/fduwjj/229/orig
2025-12-04T12:53:07.9481590Z * [new branch] gh/fduwjj/233/base -> origin/gh/fduwjj/233/base
2025-12-04T12:53:07.9481712Z * [new branch] gh/fduwjj/233/head -> origin/gh/fduwjj/233/head
2025-12-04T12:53:07.9481785Z * [new branch] gh/fduwjj/233/orig -> origin/gh/fduwjj/233/orig
2025-12-04T12:53:07.9481852Z * [new branch] gh/fduwjj/234/base -> origin/gh/fduwjj/234/base
2025-12-04T12:53:07.9481921Z * [new branch] gh/fduwjj/234/head -> origin/gh/fduwjj/234/head
2025-12-04T12:53:07.9481990Z * [new branch] gh/fduwjj/234/orig -> origin/gh/fduwjj/234/orig
2025-12-04T12:53:07.9482059Z * [new branch] gh/fduwjj/235/base -> origin/gh/fduwjj/235/base
2025-12-04T12:53:07.9482129Z * [new branch] gh/fduwjj/235/head -> origin/gh/fduwjj/235/head
2025-12-04T12:53:07.9482196Z * [new branch] gh/fduwjj/235/orig -> origin/gh/fduwjj/235/orig
2025-12-04T12:53:07.9482262Z * [new branch] gh/fduwjj/236/base -> origin/gh/fduwjj/236/base
2025-12-04T12:53:07.9482339Z * [new branch] gh/fduwjj/236/head -> origin/gh/fduwjj/236/head
2025-12-04T12:53:07.9482406Z * [new branch] gh/fduwjj/236/orig -> origin/gh/fduwjj/236/orig
2025-12-04T12:53:07.9482473Z * [new branch] gh/fduwjj/237/base -> origin/gh/fduwjj/237/base
2025-12-04T12:53:07.9482540Z * [new branch] gh/fduwjj/237/head -> origin/gh/fduwjj/237/head
2025-12-04T12:53:07.9482659Z * [new branch] gh/fduwjj/237/orig -> origin/gh/fduwjj/237/orig
2025-12-04T12:53:07.9482726Z * [new branch] gh/fduwjj/238/base -> origin/gh/fduwjj/238/base
2025-12-04T12:53:07.9482794Z * [new branch] gh/fduwjj/238/head -> origin/gh/fduwjj/238/head
2025-12-04T12:53:07.9482863Z * [new branch] gh/fduwjj/238/orig -> origin/gh/fduwjj/238/orig
2025-12-04T12:53:07.9482932Z * [new branch] gh/fduwjj/239/base -> origin/gh/fduwjj/239/base
2025-12-04T12:53:07.9483004Z * [new branch] gh/fduwjj/239/head -> origin/gh/fduwjj/239/head
2025-12-04T12:53:07.9483071Z * [new branch] gh/fduwjj/239/orig -> origin/gh/fduwjj/239/orig
2025-12-04T12:53:07.9483141Z * [new branch] gh/fegin/332/base -> origin/gh/fegin/332/base
2025-12-04T12:53:07.9483213Z * [new branch] gh/fegin/332/head -> origin/gh/fegin/332/head
2025-12-04T12:53:07.9483280Z * [new branch] gh/fegin/332/orig -> origin/gh/fegin/332/orig
2025-12-04T12:53:07.9483347Z * [new branch] gh/fegin/333/base -> origin/gh/fegin/333/base
2025-12-04T12:53:07.9483414Z * [new branch] gh/fegin/333/head -> origin/gh/fegin/333/head
2025-12-04T12:53:07.9483480Z * [new branch] gh/fegin/333/orig -> origin/gh/fegin/333/orig
2025-12-04T12:53:07.9483545Z * [new branch] gh/fegin/334/base -> origin/gh/fegin/334/base
2025-12-04T12:53:07.9483613Z * [new branch] gh/fegin/334/head -> origin/gh/fegin/334/head
2025-12-04T12:53:07.9483679Z * [new branch] gh/fegin/334/orig -> origin/gh/fegin/334/orig
2025-12-04T12:53:07.9483747Z * [new branch] gh/fegin/335/base -> origin/gh/fegin/335/base
2025-12-04T12:53:07.9483812Z * [new branch] gh/fegin/335/head -> origin/gh/fegin/335/head
2025-12-04T12:53:07.9483879Z * [new branch] gh/fegin/335/orig -> origin/gh/fegin/335/orig
2025-12-04T12:53:07.9483949Z * [new branch] gh/fffrog/160/base -> origin/gh/fffrog/160/base
2025-12-04T12:53:07.9484017Z * [new branch] gh/fffrog/160/head -> origin/gh/fffrog/160/head
2025-12-04T12:53:07.9484083Z * [new branch] gh/fffrog/177/base -> origin/gh/fffrog/177/base
2025-12-04T12:53:07.9484151Z * [new branch] gh/fffrog/177/head -> origin/gh/fffrog/177/head
2025-12-04T12:53:07.9484256Z * [new branch] gh/fffrog/177/orig -> origin/gh/fffrog/177/orig
2025-12-04T12:53:07.9484326Z * [new branch] gh/fffrog/178/base -> origin/gh/fffrog/178/base
2025-12-04T12:53:07.9484396Z * [new branch] gh/fffrog/178/head -> origin/gh/fffrog/178/head
2025-12-04T12:53:07.9484461Z * [new branch] gh/fffrog/178/orig -> origin/gh/fffrog/178/orig
2025-12-04T12:53:07.9484528Z * [new branch] gh/fffrog/181/base -> origin/gh/fffrog/181/base
2025-12-04T12:53:07.9484596Z * [new branch] gh/fffrog/181/head -> origin/gh/fffrog/181/head
2025-12-04T12:53:07.9484662Z * [new branch] gh/fffrog/181/orig -> origin/gh/fffrog/181/orig
2025-12-04T12:53:07.9484728Z * [new branch] gh/fffrog/183/base -> origin/gh/fffrog/183/base
2025-12-04T12:53:07.9484797Z * [new branch] gh/fffrog/183/head -> origin/gh/fffrog/183/head
2025-12-04T12:53:07.9484865Z * [new branch] gh/fffrog/183/orig -> origin/gh/fffrog/183/orig
2025-12-04T12:53:07.9484933Z * [new branch] gh/fxdawnn/10/base -> origin/gh/fxdawnn/10/base
2025-12-04T12:53:07.9485002Z * [new branch] gh/fxdawnn/10/head -> origin/gh/fxdawnn/10/head
2025-12-04T12:53:07.9485069Z * [new branch] gh/fxdawnn/10/orig -> origin/gh/fxdawnn/10/orig
2025-12-04T12:53:07.9485166Z * [new branch] gh/fxdawnn/11/base -> origin/gh/fxdawnn/11/base
2025-12-04T12:53:07.9485236Z * [new branch] gh/fxdawnn/11/head -> origin/gh/fxdawnn/11/head
2025-12-04T12:53:07.9485304Z * [new branch] gh/fxdawnn/11/orig -> origin/gh/fxdawnn/11/orig
2025-12-04T12:53:07.9485373Z * [new branch] gh/fxdawnn/12/base -> origin/gh/fxdawnn/12/base
2025-12-04T12:53:07.9485440Z * [new branch] gh/fxdawnn/12/head -> origin/gh/fxdawnn/12/head
2025-12-04T12:53:07.9485508Z * [new branch] gh/fxdawnn/12/orig -> origin/gh/fxdawnn/12/orig
2025-12-04T12:53:07.9485577Z * [new branch] gh/fxdawnn/13/base -> origin/gh/fxdawnn/13/base
2025-12-04T12:53:07.9485644Z * [new branch] gh/fxdawnn/13/head -> origin/gh/fxdawnn/13/head
2025-12-04T12:53:07.9485712Z * [new branch] gh/fxdawnn/13/orig -> origin/gh/fxdawnn/13/orig
2025-12-04T12:53:07.9485782Z * [new branch] gh/fxdawnn/14/base -> origin/gh/fxdawnn/14/base
2025-12-04T12:53:07.9485848Z * [new branch] gh/fxdawnn/14/head -> origin/gh/fxdawnn/14/head
2025-12-04T12:53:07.9485916Z * [new branch] gh/fxdawnn/14/orig -> origin/gh/fxdawnn/14/orig
2025-12-04T12:53:07.9485986Z * [new branch] gh/fxdawnn/15/base -> origin/gh/fxdawnn/15/base
2025-12-04T12:53:07.9486054Z * [new branch] gh/fxdawnn/15/head -> origin/gh/fxdawnn/15/head
2025-12-04T12:53:07.9486121Z * [new branch] gh/fxdawnn/15/orig -> origin/gh/fxdawnn/15/orig
2025-12-04T12:53:07.9486193Z * [new branch] gh/fxdawnn/6/base -> origin/gh/fxdawnn/6/base
2025-12-04T12:53:07.9486260Z * [new branch] gh/fxdawnn/6/head -> origin/gh/fxdawnn/6/head
2025-12-04T12:53:07.9486327Z * [new branch] gh/fxdawnn/6/orig -> origin/gh/fxdawnn/6/orig
2025-12-04T12:53:07.9486396Z * [new branch] gh/fxdawnn/7/base -> origin/gh/fxdawnn/7/base
2025-12-04T12:53:07.9486463Z * [new branch] gh/fxdawnn/7/head -> origin/gh/fxdawnn/7/head
2025-12-04T12:53:07.9486529Z * [new branch] gh/fxdawnn/7/orig -> origin/gh/fxdawnn/7/orig
2025-12-04T12:53:07.9486596Z * [new branch] gh/fxdawnn/9/base -> origin/gh/fxdawnn/9/base
2025-12-04T12:53:07.9486661Z * [new branch] gh/fxdawnn/9/head -> origin/gh/fxdawnn/9/head
2025-12-04T12:53:07.9486727Z * [new branch] gh/fxdawnn/9/orig -> origin/gh/fxdawnn/9/orig
2025-12-04T12:53:07.9486825Z * [new branch] gh/galv/1/base -> origin/gh/galv/1/base
2025-12-04T12:53:07.9486890Z * [new branch] gh/galv/1/head -> origin/gh/galv/1/head
2025-12-04T12:53:07.9486954Z * [new branch] gh/galv/1/orig -> origin/gh/galv/1/orig
2025-12-04T12:53:07.9487020Z * [new branch] gh/galv/2/base -> origin/gh/galv/2/base
2025-12-04T12:53:07.9487083Z * [new branch] gh/galv/2/head -> origin/gh/galv/2/head
2025-12-04T12:53:07.9487147Z * [new branch] gh/galv/2/orig -> origin/gh/galv/2/orig
2025-12-04T12:53:07.9487210Z * [new branch] gh/galv/3/base -> origin/gh/galv/3/base
2025-12-04T12:53:07.9487272Z * [new branch] gh/galv/3/head -> origin/gh/galv/3/head
2025-12-04T12:53:07.9487335Z * [new branch] gh/galv/3/orig -> origin/gh/galv/3/orig
2025-12-04T12:53:07.9487413Z * [new branch] gh/guangyey/134/base -> origin/gh/guangyey/134/base
2025-12-04T12:53:07.9487487Z * [new branch] gh/guangyey/134/head -> origin/gh/guangyey/134/head
2025-12-04T12:53:07.9487559Z * [new branch] gh/guangyey/134/orig -> origin/gh/guangyey/134/orig
2025-12-04T12:53:07.9487657Z * [new branch] gh/guangyey/163/base -> origin/gh/guangyey/163/base
2025-12-04T12:53:07.9487730Z * [new branch] gh/guangyey/163/head -> origin/gh/guangyey/163/head
2025-12-04T12:53:07.9487802Z * [new branch] gh/guangyey/163/orig -> origin/gh/guangyey/163/orig
2025-12-04T12:53:07.9487873Z * [new branch] gh/guangyey/168/base -> origin/gh/guangyey/168/base
2025-12-04T12:53:07.9487942Z * [new branch] gh/guangyey/168/head -> origin/gh/guangyey/168/head
2025-12-04T12:53:07.9488014Z * [new branch] gh/guangyey/168/orig -> origin/gh/guangyey/168/orig
2025-12-04T12:53:07.9488085Z * [new branch] gh/guangyey/169/base -> origin/gh/guangyey/169/base
2025-12-04T12:53:07.9488155Z * [new branch] gh/guangyey/169/head -> origin/gh/guangyey/169/head
2025-12-04T12:53:07.9488226Z * [new branch] gh/guangyey/169/orig -> origin/gh/guangyey/169/orig
2025-12-04T12:53:07.9488297Z * [new branch] gh/guangyey/170/base -> origin/gh/guangyey/170/base
2025-12-04T12:53:07.9488366Z * [new branch] gh/guangyey/170/head -> origin/gh/guangyey/170/head
2025-12-04T12:53:07.9488438Z * [new branch] gh/guangyey/170/orig -> origin/gh/guangyey/170/orig
2025-12-04T12:53:07.9488509Z * [new branch] gh/guangyey/171/base -> origin/gh/guangyey/171/base
2025-12-04T12:53:07.9488583Z * [new branch] gh/guangyey/171/head -> origin/gh/guangyey/171/head
2025-12-04T12:53:07.9488656Z * [new branch] gh/guangyey/171/orig -> origin/gh/guangyey/171/orig
2025-12-04T12:53:07.9488728Z * [new branch] gh/guangyey/178/base -> origin/gh/guangyey/178/base
2025-12-04T12:53:07.9488799Z * [new branch] gh/guangyey/178/head -> origin/gh/guangyey/178/head
2025-12-04T12:53:07.9488868Z * [new branch] gh/guangyey/178/orig -> origin/gh/guangyey/178/orig
2025-12-04T12:53:07.9488938Z * [new branch] gh/guangyey/182/base -> origin/gh/guangyey/182/base
2025-12-04T12:53:07.9489009Z * [new branch] gh/guangyey/182/head -> origin/gh/guangyey/182/head
2025-12-04T12:53:07.9489078Z * [new branch] gh/guangyey/182/orig -> origin/gh/guangyey/182/orig
2025-12-04T12:53:07.9489148Z * [new branch] gh/guangyey/183/base -> origin/gh/guangyey/183/base
2025-12-04T12:53:07.9489218Z * [new branch] gh/guangyey/183/head -> origin/gh/guangyey/183/head
2025-12-04T12:53:07.9489287Z * [new branch] gh/guangyey/183/orig -> origin/gh/guangyey/183/orig
2025-12-04T12:53:07.9489437Z * [new branch] gh/guangyey/185/base -> origin/gh/guangyey/185/base
2025-12-04T12:53:07.9489509Z * [new branch] gh/guangyey/185/head -> origin/gh/guangyey/185/head
2025-12-04T12:53:07.9489579Z * [new branch] gh/guangyey/185/orig -> origin/gh/guangyey/185/orig
2025-12-04T12:53:07.9489651Z * [new branch] gh/guangyey/186/base -> origin/gh/guangyey/186/base
2025-12-04T12:53:07.9489723Z * [new branch] gh/guangyey/186/head -> origin/gh/guangyey/186/head
2025-12-04T12:53:07.9489793Z * [new branch] gh/guangyey/186/orig -> origin/gh/guangyey/186/orig
2025-12-04T12:53:07.9489864Z * [new branch] gh/guangyey/187/base -> origin/gh/guangyey/187/base
2025-12-04T12:53:07.9489935Z * [new branch] gh/guangyey/187/head -> origin/gh/guangyey/187/head
2025-12-04T12:53:07.9490005Z * [new branch] gh/guangyey/187/orig -> origin/gh/guangyey/187/orig
2025-12-04T12:53:07.9490076Z * [new branch] gh/guangyey/188/base -> origin/gh/guangyey/188/base
2025-12-04T12:53:07.9490147Z * [new branch] gh/guangyey/188/head -> origin/gh/guangyey/188/head
2025-12-04T12:53:07.9490262Z * [new branch] gh/guangyey/188/orig -> origin/gh/guangyey/188/orig
2025-12-04T12:53:07.9490381Z * [new branch] gh/guangyey/190/base -> origin/gh/guangyey/190/base
2025-12-04T12:53:07.9490452Z * [new branch] gh/guangyey/190/head -> origin/gh/guangyey/190/head
2025-12-04T12:53:07.9490521Z * [new branch] gh/guangyey/190/orig -> origin/gh/guangyey/190/orig
2025-12-04T12:53:07.9490592Z * [new branch] gh/guangyey/208/base -> origin/gh/guangyey/208/base
2025-12-04T12:53:07.9490661Z * [new branch] gh/guangyey/208/head -> origin/gh/guangyey/208/head
2025-12-04T12:53:07.9490731Z * [new branch] gh/guangyey/208/orig -> origin/gh/guangyey/208/orig
2025-12-04T12:53:07.9490803Z * [new branch] gh/guangyey/228/base -> origin/gh/guangyey/228/base
2025-12-04T12:53:07.9490871Z * [new branch] gh/guangyey/228/head -> origin/gh/guangyey/228/head
2025-12-04T12:53:07.9490941Z * [new branch] gh/guangyey/228/orig -> origin/gh/guangyey/228/orig
2025-12-04T12:53:07.9491013Z * [new branch] gh/guangyey/230/base -> origin/gh/guangyey/230/base
2025-12-04T12:53:07.9491083Z * [new branch] gh/guangyey/230/head -> origin/gh/guangyey/230/head
2025-12-04T12:53:07.9491152Z * [new branch] gh/guangyey/230/orig -> origin/gh/guangyey/230/orig
2025-12-04T12:53:07.9491222Z * [new branch] gh/guangyey/231/base -> origin/gh/guangyey/231/base
2025-12-04T12:53:07.9491291Z * [new branch] gh/guangyey/231/head -> origin/gh/guangyey/231/head
2025-12-04T12:53:07.9491360Z * [new branch] gh/guangyey/231/orig -> origin/gh/guangyey/231/orig
2025-12-04T12:53:07.9491434Z * [new branch] gh/guangyey/232/base -> origin/gh/guangyey/232/base
2025-12-04T12:53:07.9491504Z * [new branch] gh/guangyey/232/head -> origin/gh/guangyey/232/head
2025-12-04T12:53:07.9491573Z * [new branch] gh/guangyey/232/orig -> origin/gh/guangyey/232/orig
2025-12-04T12:53:07.9491646Z * [new branch] gh/guangyey/233/base -> origin/gh/guangyey/233/base
2025-12-04T12:53:07.9491716Z * [new branch] gh/guangyey/233/head -> origin/gh/guangyey/233/head
2025-12-04T12:53:07.9491787Z * [new branch] gh/guangyey/233/orig -> origin/gh/guangyey/233/orig
2025-12-04T12:53:07.9491857Z * [new branch] gh/guangyey/234/base -> origin/gh/guangyey/234/base
2025-12-04T12:53:07.9491927Z * [new branch] gh/guangyey/234/head -> origin/gh/guangyey/234/head
2025-12-04T12:53:07.9492046Z * [new branch] gh/guangyey/234/orig -> origin/gh/guangyey/234/orig
2025-12-04T12:53:07.9492115Z * [new branch] gh/guangyey/235/base -> origin/gh/guangyey/235/base
2025-12-04T12:53:07.9492184Z * [new branch] gh/guangyey/235/head -> origin/gh/guangyey/235/head
2025-12-04T12:53:07.9492256Z * [new branch] gh/guangyey/235/orig -> origin/gh/guangyey/235/orig
2025-12-04T12:53:07.9492327Z * [new branch] gh/guangyey/236/base -> origin/gh/guangyey/236/base
2025-12-04T12:53:07.9492397Z * [new branch] gh/guangyey/236/head -> origin/gh/guangyey/236/head
2025-12-04T12:53:07.9492468Z * [new branch] gh/guangyey/236/orig -> origin/gh/guangyey/236/orig
2025-12-04T12:53:07.9492537Z * [new branch] gh/guangyey/237/base -> origin/gh/guangyey/237/base
2025-12-04T12:53:07.9492606Z * [new branch] gh/guangyey/237/head -> origin/gh/guangyey/237/head
2025-12-04T12:53:07.9492679Z * [new branch] gh/guangyey/237/orig -> origin/gh/guangyey/237/orig
2025-12-04T12:53:07.9492749Z * [new branch] gh/guangyey/238/base -> origin/gh/guangyey/238/base
2025-12-04T12:53:07.9492820Z * [new branch] gh/guangyey/238/head -> origin/gh/guangyey/238/head
2025-12-04T12:53:07.9492892Z * [new branch] gh/guangyey/239/base -> origin/gh/guangyey/239/base
2025-12-04T12:53:07.9493010Z * [new branch] gh/guangyey/239/head -> origin/gh/guangyey/239/head
2025-12-04T12:53:07.9493082Z * [new branch] gh/guangyey/239/orig -> origin/gh/guangyey/239/orig
2025-12-04T12:53:07.9493153Z * [new branch] gh/guangyey/240/base -> origin/gh/guangyey/240/base
2025-12-04T12:53:07.9493222Z * [new branch] gh/guangyey/240/head -> origin/gh/guangyey/240/head
2025-12-04T12:53:07.9493293Z * [new branch] gh/guangyey/240/orig -> origin/gh/guangyey/240/orig
2025-12-04T12:53:07.9493365Z * [new branch] gh/guangyey/241/base -> origin/gh/guangyey/241/base
2025-12-04T12:53:07.9493435Z * [new branch] gh/guangyey/241/head -> origin/gh/guangyey/241/head
2025-12-04T12:53:07.9493505Z * [new branch] gh/guangyey/241/orig -> origin/gh/guangyey/241/orig
2025-12-04T12:53:07.9493574Z * [new branch] gh/guangyey/242/base -> origin/gh/guangyey/242/base
2025-12-04T12:53:07.9493646Z * [new branch] gh/guangyey/242/head -> origin/gh/guangyey/242/head
2025-12-04T12:53:07.9493717Z * [new branch] gh/guangyey/242/orig -> origin/gh/guangyey/242/orig
2025-12-04T12:53:07.9493787Z * [new branch] gh/guangyey/243/base -> origin/gh/guangyey/243/base
2025-12-04T12:53:07.9493856Z * [new branch] gh/guangyey/243/head -> origin/gh/guangyey/243/head
2025-12-04T12:53:07.9493927Z * [new branch] gh/guangyey/243/orig -> origin/gh/guangyey/243/orig
origin/gh/guangyey/243/orig 2025-12-04T12:53:07.9493998Z * [new branch] gh/guangyey/244/base -> origin/gh/guangyey/244/base 2025-12-04T12:53:07.9494067Z * [new branch] gh/guangyey/244/head -> origin/gh/guangyey/244/head 2025-12-04T12:53:07.9494139Z * [new branch] gh/guangyey/244/orig -> origin/gh/guangyey/244/orig 2025-12-04T12:53:07.9494208Z * [new branch] gh/guangyey/245/base -> origin/gh/guangyey/245/base 2025-12-04T12:53:07.9494279Z * [new branch] gh/guangyey/245/head -> origin/gh/guangyey/245/head 2025-12-04T12:53:07.9494350Z * [new branch] gh/guangyey/245/orig -> origin/gh/guangyey/245/orig 2025-12-04T12:53:07.9494420Z * [new branch] gh/guangyey/246/base -> origin/gh/guangyey/246/base 2025-12-04T12:53:07.9494489Z * [new branch] gh/guangyey/246/head -> origin/gh/guangyey/246/head 2025-12-04T12:53:07.9494560Z * [new branch] gh/guangyey/246/orig -> origin/gh/guangyey/246/orig 2025-12-04T12:53:07.9494657Z * [new branch] gh/guangyey/247/base -> origin/gh/guangyey/247/base 2025-12-04T12:53:07.9494727Z * [new branch] gh/guangyey/247/head -> origin/gh/guangyey/247/head 2025-12-04T12:53:07.9494797Z * [new branch] gh/guangyey/247/orig -> origin/gh/guangyey/247/orig 2025-12-04T12:53:07.9494868Z * [new branch] gh/guangyey/248/base -> origin/gh/guangyey/248/base 2025-12-04T12:53:07.9494940Z * [new branch] gh/guangyey/248/head -> origin/gh/guangyey/248/head 2025-12-04T12:53:07.9495009Z * [new branch] gh/guangyey/248/orig -> origin/gh/guangyey/248/orig 2025-12-04T12:53:07.9495079Z * [new branch] gh/guangyey/249/base -> origin/gh/guangyey/249/base 2025-12-04T12:53:07.9495150Z * [new branch] gh/guangyey/249/head -> origin/gh/guangyey/249/head 2025-12-04T12:53:07.9495219Z * [new branch] gh/guangyey/249/orig -> origin/gh/guangyey/249/orig 2025-12-04T12:53:07.9495290Z * [new branch] gh/guangyey/250/base -> origin/gh/guangyey/250/base 2025-12-04T12:53:07.9495360Z * [new branch] gh/guangyey/250/head -> origin/gh/guangyey/250/head 2025-12-04T12:53:07.9495429Z * [new branch] gh/guangyey/250/orig -> origin/gh/guangyey/250/orig 2025-12-04T12:53:07.9495528Z * [new branch] gh/guangyey/251/base -> origin/gh/guangyey/251/base 2025-12-04T12:53:07.9495600Z * [new branch] gh/guangyey/251/head -> origin/gh/guangyey/251/head 2025-12-04T12:53:07.9495669Z * [new branch] gh/guangyey/251/orig -> origin/gh/guangyey/251/orig 2025-12-04T12:53:07.9495739Z * [new branch] gh/guangyey/252/base -> origin/gh/guangyey/252/base 2025-12-04T12:53:07.9495809Z * [new branch] gh/guangyey/252/head -> origin/gh/guangyey/252/head 2025-12-04T12:53:07.9495878Z * [new branch] gh/guangyey/252/orig -> origin/gh/guangyey/252/orig 2025-12-04T12:53:07.9495950Z * [new branch] gh/guangyey/253/base -> origin/gh/guangyey/253/base 2025-12-04T12:53:07.9496021Z * [new branch] gh/guangyey/253/head -> origin/gh/guangyey/253/head 2025-12-04T12:53:07.9496091Z * [new branch] gh/guangyey/253/orig -> origin/gh/guangyey/253/orig 2025-12-04T12:53:07.9496161Z * [new branch] gh/guangyey/254/base -> origin/gh/guangyey/254/base 2025-12-04T12:53:07.9496232Z * [new branch] gh/guangyey/254/head -> origin/gh/guangyey/254/head 2025-12-04T12:53:07.9496301Z * [new branch] gh/guangyey/254/orig -> origin/gh/guangyey/254/orig 2025-12-04T12:53:07.9496371Z * [new branch] gh/guangyey/255/base -> origin/gh/guangyey/255/base 2025-12-04T12:53:07.9496440Z * [new branch] gh/guangyey/255/head -> origin/gh/guangyey/255/head 2025-12-04T12:53:07.9496511Z * [new branch] gh/guangyey/255/orig -> origin/gh/guangyey/255/orig 2025-12-04T12:53:07.9496584Z * [new branch] gh/guangyey/256/base -> 
origin/gh/guangyey/256/base 2025-12-04T12:53:07.9496653Z * [new branch] gh/guangyey/256/head -> origin/gh/guangyey/256/head 2025-12-04T12:53:07.9496722Z * [new branch] gh/guangyey/256/orig -> origin/gh/guangyey/256/orig 2025-12-04T12:53:07.9496822Z * [new branch] gh/guilhermeleobas/107/base -> origin/gh/guilhermeleobas/107/base 2025-12-04T12:53:07.9496913Z * [new branch] gh/guilhermeleobas/107/head -> origin/gh/guilhermeleobas/107/head 2025-12-04T12:53:07.9497001Z * [new branch] gh/guilhermeleobas/107/orig -> origin/gh/guilhermeleobas/107/orig 2025-12-04T12:53:07.9497089Z * [new branch] gh/guilhermeleobas/108/base -> origin/gh/guilhermeleobas/108/base 2025-12-04T12:53:07.9497175Z * [new branch] gh/guilhermeleobas/108/head -> origin/gh/guilhermeleobas/108/head 2025-12-04T12:53:07.9497291Z * [new branch] gh/guilhermeleobas/108/orig -> origin/gh/guilhermeleobas/108/orig 2025-12-04T12:53:07.9497380Z * [new branch] gh/guilhermeleobas/150/base -> origin/gh/guilhermeleobas/150/base 2025-12-04T12:53:07.9497467Z * [new branch] gh/guilhermeleobas/150/head -> origin/gh/guilhermeleobas/150/head 2025-12-04T12:53:07.9497557Z * [new branch] gh/guilhermeleobas/150/orig -> origin/gh/guilhermeleobas/150/orig 2025-12-04T12:53:07.9497645Z * [new branch] gh/guilhermeleobas/168/base -> origin/gh/guilhermeleobas/168/base 2025-12-04T12:53:07.9497732Z * [new branch] gh/guilhermeleobas/168/head -> origin/gh/guilhermeleobas/168/head 2025-12-04T12:53:07.9497822Z * [new branch] gh/guilhermeleobas/168/orig -> origin/gh/guilhermeleobas/168/orig 2025-12-04T12:53:07.9497908Z * [new branch] gh/guilhermeleobas/169/base -> origin/gh/guilhermeleobas/169/base 2025-12-04T12:53:07.9497997Z * [new branch] gh/guilhermeleobas/169/head -> origin/gh/guilhermeleobas/169/head 2025-12-04T12:53:07.9498086Z * [new branch] gh/guilhermeleobas/169/orig -> origin/gh/guilhermeleobas/169/orig 2025-12-04T12:53:07.9498174Z * [new branch] gh/guilhermeleobas/170/base -> origin/gh/guilhermeleobas/170/base 2025-12-04T12:53:07.9498291Z * [new branch] gh/guilhermeleobas/170/head -> origin/gh/guilhermeleobas/170/head 2025-12-04T12:53:07.9498381Z * [new branch] gh/guilhermeleobas/170/orig -> origin/gh/guilhermeleobas/170/orig 2025-12-04T12:53:07.9498468Z * [new branch] gh/guilhermeleobas/171/base -> origin/gh/guilhermeleobas/171/base 2025-12-04T12:53:07.9498555Z * [new branch] gh/guilhermeleobas/171/head -> origin/gh/guilhermeleobas/171/head 2025-12-04T12:53:07.9498644Z * [new branch] gh/guilhermeleobas/171/orig -> origin/gh/guilhermeleobas/171/orig 2025-12-04T12:53:07.9498735Z * [new branch] gh/guilhermeleobas/173/base -> origin/gh/guilhermeleobas/173/base 2025-12-04T12:53:07.9498821Z * [new branch] gh/guilhermeleobas/173/head -> origin/gh/guilhermeleobas/173/head 2025-12-04T12:53:07.9498910Z * [new branch] gh/guilhermeleobas/173/orig -> origin/gh/guilhermeleobas/173/orig 2025-12-04T12:53:07.9498998Z * [new branch] gh/guilhermeleobas/193/base -> origin/gh/guilhermeleobas/193/base 2025-12-04T12:53:07.9499086Z * [new branch] gh/guilhermeleobas/193/head -> origin/gh/guilhermeleobas/193/head 2025-12-04T12:53:07.9499177Z * [new branch] gh/guilhermeleobas/193/orig -> origin/gh/guilhermeleobas/193/orig 2025-12-04T12:53:07.9499263Z * [new branch] gh/guilhermeleobas/204/base -> origin/gh/guilhermeleobas/204/base 2025-12-04T12:53:07.9499351Z * [new branch] gh/guilhermeleobas/204/head -> origin/gh/guilhermeleobas/204/head 2025-12-04T12:53:07.9499439Z * [new branch] gh/guilhermeleobas/204/orig -> origin/gh/guilhermeleobas/204/orig 2025-12-04T12:53:07.9499527Z * 
[new branch] gh/guilhermeleobas/211/base -> origin/gh/guilhermeleobas/211/base 2025-12-04T12:53:07.9499616Z * [new branch] gh/guilhermeleobas/211/head -> origin/gh/guilhermeleobas/211/head 2025-12-04T12:53:07.9499702Z * [new branch] gh/guilhermeleobas/211/orig -> origin/gh/guilhermeleobas/211/orig 2025-12-04T12:53:07.9499791Z * [new branch] gh/guilhermeleobas/226/base -> origin/gh/guilhermeleobas/226/base 2025-12-04T12:53:07.9499880Z * [new branch] gh/guilhermeleobas/226/head -> origin/gh/guilhermeleobas/226/head 2025-12-04T12:53:07.9499969Z * [new branch] gh/guilhermeleobas/226/orig -> origin/gh/guilhermeleobas/226/orig 2025-12-04T12:53:07.9500057Z * [new branch] gh/guilhermeleobas/236/base -> origin/gh/guilhermeleobas/236/base 2025-12-04T12:53:07.9500148Z * [new branch] gh/guilhermeleobas/236/head -> origin/gh/guilhermeleobas/236/head 2025-12-04T12:53:07.9500375Z * [new branch] gh/guilhermeleobas/236/orig -> origin/gh/guilhermeleobas/236/orig 2025-12-04T12:53:07.9500462Z * [new branch] gh/guilhermeleobas/247/base -> origin/gh/guilhermeleobas/247/base 2025-12-04T12:53:07.9500549Z * [new branch] gh/guilhermeleobas/247/head -> origin/gh/guilhermeleobas/247/head 2025-12-04T12:53:07.9500638Z * [new branch] gh/guilhermeleobas/247/orig -> origin/gh/guilhermeleobas/247/orig 2025-12-04T12:53:07.9500727Z * [new branch] gh/guilhermeleobas/248/base -> origin/gh/guilhermeleobas/248/base 2025-12-04T12:53:07.9500813Z * [new branch] gh/guilhermeleobas/248/head -> origin/gh/guilhermeleobas/248/head 2025-12-04T12:53:07.9500903Z * [new branch] gh/guilhermeleobas/248/orig -> origin/gh/guilhermeleobas/248/orig 2025-12-04T12:53:07.9500989Z * [new branch] gh/guilhermeleobas/250/base -> origin/gh/guilhermeleobas/250/base 2025-12-04T12:53:07.9501078Z * [new branch] gh/guilhermeleobas/250/head -> origin/gh/guilhermeleobas/250/head 2025-12-04T12:53:07.9501166Z * [new branch] gh/guilhermeleobas/250/orig -> origin/gh/guilhermeleobas/250/orig 2025-12-04T12:53:07.9501252Z * [new branch] gh/guilhermeleobas/253/base -> origin/gh/guilhermeleobas/253/base 2025-12-04T12:53:07.9501382Z * [new branch] gh/guilhermeleobas/253/head -> origin/gh/guilhermeleobas/253/head 2025-12-04T12:53:07.9501468Z * [new branch] gh/guilhermeleobas/253/orig -> origin/gh/guilhermeleobas/253/orig 2025-12-04T12:53:07.9501555Z * [new branch] gh/guilhermeleobas/254/base -> origin/gh/guilhermeleobas/254/base 2025-12-04T12:53:07.9501642Z * [new branch] gh/guilhermeleobas/254/head -> origin/gh/guilhermeleobas/254/head 2025-12-04T12:53:07.9501728Z * [new branch] gh/guilhermeleobas/254/orig -> origin/gh/guilhermeleobas/254/orig 2025-12-04T12:53:07.9501817Z * [new branch] gh/guilhermeleobas/255/base -> origin/gh/guilhermeleobas/255/base 2025-12-04T12:53:07.9501905Z * [new branch] gh/guilhermeleobas/255/head -> origin/gh/guilhermeleobas/255/head 2025-12-04T12:53:07.9501991Z * [new branch] gh/guilhermeleobas/255/orig -> origin/gh/guilhermeleobas/255/orig 2025-12-04T12:53:07.9502078Z * [new branch] gh/guilhermeleobas/256/base -> origin/gh/guilhermeleobas/256/base 2025-12-04T12:53:07.9502166Z * [new branch] gh/guilhermeleobas/256/head -> origin/gh/guilhermeleobas/256/head 2025-12-04T12:53:07.9502253Z * [new branch] gh/guilhermeleobas/256/orig -> origin/gh/guilhermeleobas/256/orig 2025-12-04T12:53:07.9502340Z * [new branch] gh/guilhermeleobas/257/base -> origin/gh/guilhermeleobas/257/base 2025-12-04T12:53:07.9502427Z * [new branch] gh/guilhermeleobas/257/head -> origin/gh/guilhermeleobas/257/head 2025-12-04T12:53:07.9502515Z * [new branch] 
gh/guilhermeleobas/257/orig -> origin/gh/guilhermeleobas/257/orig 2025-12-04T12:53:07.9502601Z * [new branch] gh/guilhermeleobas/258/base -> origin/gh/guilhermeleobas/258/base 2025-12-04T12:53:07.9502690Z * [new branch] gh/guilhermeleobas/258/head -> origin/gh/guilhermeleobas/258/head 2025-12-04T12:53:07.9502777Z * [new branch] gh/guilhermeleobas/258/orig -> origin/gh/guilhermeleobas/258/orig 2025-12-04T12:53:07.9502865Z * [new branch] gh/guilhermeleobas/259/base -> origin/gh/guilhermeleobas/259/base 2025-12-04T12:53:07.9502951Z * [new branch] gh/guilhermeleobas/259/head -> origin/gh/guilhermeleobas/259/head 2025-12-04T12:53:07.9503037Z * [new branch] gh/guilhermeleobas/259/orig -> origin/gh/guilhermeleobas/259/orig 2025-12-04T12:53:07.9503125Z * [new branch] gh/guilhermeleobas/260/base -> origin/gh/guilhermeleobas/260/base 2025-12-04T12:53:07.9503261Z * [new branch] gh/guilhermeleobas/260/head -> origin/gh/guilhermeleobas/260/head 2025-12-04T12:53:07.9503348Z * [new branch] gh/guilhermeleobas/260/orig -> origin/gh/guilhermeleobas/260/orig 2025-12-04T12:53:07.9503436Z * [new branch] gh/guilhermeleobas/261/base -> origin/gh/guilhermeleobas/261/base 2025-12-04T12:53:07.9503524Z * [new branch] gh/guilhermeleobas/261/head -> origin/gh/guilhermeleobas/261/head 2025-12-04T12:53:07.9503610Z * [new branch] gh/guilhermeleobas/261/orig -> origin/gh/guilhermeleobas/261/orig 2025-12-04T12:53:07.9503698Z * [new branch] gh/guilhermeleobas/262/base -> origin/gh/guilhermeleobas/262/base 2025-12-04T12:53:07.9503785Z * [new branch] gh/guilhermeleobas/262/head -> origin/gh/guilhermeleobas/262/head 2025-12-04T12:53:07.9503872Z * [new branch] gh/guilhermeleobas/262/orig -> origin/gh/guilhermeleobas/262/orig 2025-12-04T12:53:07.9503962Z * [new branch] gh/guilhermeleobas/263/base -> origin/gh/guilhermeleobas/263/base 2025-12-04T12:53:07.9504048Z * [new branch] gh/guilhermeleobas/263/head -> origin/gh/guilhermeleobas/263/head 2025-12-04T12:53:07.9504134Z * [new branch] gh/guilhermeleobas/263/orig -> origin/gh/guilhermeleobas/263/orig 2025-12-04T12:53:07.9504250Z * [new branch] gh/guilhermeleobas/264/base -> origin/gh/guilhermeleobas/264/base 2025-12-04T12:53:07.9504337Z * [new branch] gh/guilhermeleobas/264/head -> origin/gh/guilhermeleobas/264/head 2025-12-04T12:53:07.9504425Z * [new branch] gh/guilhermeleobas/264/orig -> origin/gh/guilhermeleobas/264/orig 2025-12-04T12:53:07.9504512Z * [new branch] gh/guilhermeleobas/265/base -> origin/gh/guilhermeleobas/265/base 2025-12-04T12:53:07.9504598Z * [new branch] gh/guilhermeleobas/265/head -> origin/gh/guilhermeleobas/265/head 2025-12-04T12:53:07.9504688Z * [new branch] gh/guilhermeleobas/265/orig -> origin/gh/guilhermeleobas/265/orig 2025-12-04T12:53:07.9504774Z * [new branch] gh/guilhermeleobas/266/base -> origin/gh/guilhermeleobas/266/base 2025-12-04T12:53:07.9504861Z * [new branch] gh/guilhermeleobas/266/head -> origin/gh/guilhermeleobas/266/head 2025-12-04T12:53:07.9504950Z * [new branch] gh/guilhermeleobas/266/orig -> origin/gh/guilhermeleobas/266/orig 2025-12-04T12:53:07.9505037Z * [new branch] gh/guilhermeleobas/267/base -> origin/gh/guilhermeleobas/267/base 2025-12-04T12:53:07.9505123Z * [new branch] gh/guilhermeleobas/267/head -> origin/gh/guilhermeleobas/267/head 2025-12-04T12:53:07.9505211Z * [new branch] gh/guilhermeleobas/267/orig -> origin/gh/guilhermeleobas/267/orig 2025-12-04T12:53:07.9505291Z * [new branch] gh/hameerabbasi/1/base -> origin/gh/hameerabbasi/1/base 2025-12-04T12:53:07.9505369Z * [new branch] gh/hameerabbasi/1/head -> 
origin/gh/hameerabbasi/1/head 2025-12-04T12:53:07.9505448Z * [new branch] gh/hameerabbasi/2/base -> origin/gh/hameerabbasi/2/base 2025-12-04T12:53:07.9505523Z * [new branch] gh/hameerabbasi/2/head -> origin/gh/hameerabbasi/2/head 2025-12-04T12:53:07.9505599Z * [new branch] gh/hameerabbasi/2/orig -> origin/gh/hameerabbasi/2/orig 2025-12-04T12:53:07.9505675Z * [new branch] gh/hameerabbasi/3/base -> origin/gh/hameerabbasi/3/base 2025-12-04T12:53:07.9505749Z * [new branch] gh/hameerabbasi/3/head -> origin/gh/hameerabbasi/3/head 2025-12-04T12:53:07.9505825Z * [new branch] gh/hameerabbasi/3/orig -> origin/gh/hameerabbasi/3/orig 2025-12-04T12:53:07.9505898Z * [new branch] gh/hameerabbasi/4/base -> origin/gh/hameerabbasi/4/base 2025-12-04T12:53:07.9505971Z * [new branch] gh/hameerabbasi/4/head -> origin/gh/hameerabbasi/4/head 2025-12-04T12:53:07.9506081Z * [new branch] gh/hameerabbasi/4/orig -> origin/gh/hameerabbasi/4/orig 2025-12-04T12:53:07.9506151Z * [new branch] gh/huydhn/1/next -> origin/gh/huydhn/1/next 2025-12-04T12:53:07.9506219Z * [new branch] gh/huydhn/2/next -> origin/gh/huydhn/2/next 2025-12-04T12:53:07.9506287Z * [new branch] gh/huydhn/3/next -> origin/gh/huydhn/3/next 2025-12-04T12:53:07.9506355Z * [new branch] gh/huydhn/4/next -> origin/gh/huydhn/4/next 2025-12-04T12:53:07.9506421Z * [new branch] gh/huydhn/5/next -> origin/gh/huydhn/5/next 2025-12-04T12:53:07.9506488Z * [new branch] gh/huydhn/6/next -> origin/gh/huydhn/6/next 2025-12-04T12:53:07.9506554Z * [new branch] gh/int3/97/base -> origin/gh/int3/97/base 2025-12-04T12:53:07.9506619Z * [new branch] gh/int3/97/head -> origin/gh/int3/97/head 2025-12-04T12:53:07.9506691Z * [new branch] gh/isuruf/101/base -> origin/gh/isuruf/101/base 2025-12-04T12:53:07.9506762Z * [new branch] gh/isuruf/101/head -> origin/gh/isuruf/101/head 2025-12-04T12:53:07.9506829Z * [new branch] gh/isuruf/146/base -> origin/gh/isuruf/146/base 2025-12-04T12:53:07.9506898Z * [new branch] gh/isuruf/146/head -> origin/gh/isuruf/146/head 2025-12-04T12:53:07.9506995Z * [new branch] gh/isuruf/146/orig -> origin/gh/isuruf/146/orig 2025-12-04T12:53:07.9507062Z * [new branch] gh/isuruf/158/base -> origin/gh/isuruf/158/base 2025-12-04T12:53:07.9507130Z * [new branch] gh/isuruf/158/head -> origin/gh/isuruf/158/head 2025-12-04T12:53:07.9507196Z * [new branch] gh/isuruf/159/base -> origin/gh/isuruf/159/base 2025-12-04T12:53:07.9507262Z * [new branch] gh/isuruf/159/head -> origin/gh/isuruf/159/head 2025-12-04T12:53:07.9507330Z * [new branch] gh/isuruf/160/base -> origin/gh/isuruf/160/base 2025-12-04T12:53:07.9507398Z * [new branch] gh/isuruf/160/head -> origin/gh/isuruf/160/head 2025-12-04T12:53:07.9507467Z * [new branch] gh/isuruf/160/orig -> origin/gh/isuruf/160/orig 2025-12-04T12:53:07.9507534Z * [new branch] gh/isuruf/81/base -> origin/gh/isuruf/81/base 2025-12-04T12:53:07.9507602Z * [new branch] gh/isuruf/81/head -> origin/gh/isuruf/81/head 2025-12-04T12:53:07.9507670Z * [new branch] gh/isuruf/81/orig -> origin/gh/isuruf/81/orig 2025-12-04T12:53:07.9507743Z * [new branch] gh/jamesjwu/176/base -> origin/gh/jamesjwu/176/base 2025-12-04T12:53:07.9507814Z * [new branch] gh/jamesjwu/176/head -> origin/gh/jamesjwu/176/head 2025-12-04T12:53:07.9507887Z * [new branch] gh/jamesjwu/176/orig -> origin/gh/jamesjwu/176/orig 2025-12-04T12:53:07.9507958Z * [new branch] gh/jamesjwu/187/base -> origin/gh/jamesjwu/187/base 2025-12-04T12:53:07.9508030Z * [new branch] gh/jamesjwu/187/head -> origin/gh/jamesjwu/187/head 2025-12-04T12:53:07.9508100Z * [new branch] gh/jamesjwu/187/orig -> 
origin/gh/jamesjwu/187/orig 2025-12-04T12:53:07.9508169Z * [new branch] gh/jamesjwu/196/base -> origin/gh/jamesjwu/196/base 2025-12-04T12:53:07.9508240Z * [new branch] gh/jamesjwu/196/head -> origin/gh/jamesjwu/196/head 2025-12-04T12:53:07.9508310Z * [new branch] gh/jamesjwu/196/orig -> origin/gh/jamesjwu/196/orig 2025-12-04T12:53:07.9508379Z * [new branch] gh/jamesjwu/198/base -> origin/gh/jamesjwu/198/base 2025-12-04T12:53:07.9508448Z * [new branch] gh/jamesjwu/198/head -> origin/gh/jamesjwu/198/head 2025-12-04T12:53:07.9508518Z * [new branch] gh/jamesjwu/198/orig -> origin/gh/jamesjwu/198/orig 2025-12-04T12:53:07.9508587Z * [new branch] gh/jamesjwu/207/base -> origin/gh/jamesjwu/207/base 2025-12-04T12:53:07.9508692Z * [new branch] gh/jamesjwu/207/head -> origin/gh/jamesjwu/207/head 2025-12-04T12:53:07.9508763Z * [new branch] gh/jamesjwu/207/orig -> origin/gh/jamesjwu/207/orig 2025-12-04T12:53:07.9508832Z * [new branch] gh/jamesjwu/208/base -> origin/gh/jamesjwu/208/base 2025-12-04T12:53:07.9508902Z * [new branch] gh/jamesjwu/208/head -> origin/gh/jamesjwu/208/head 2025-12-04T12:53:07.9508973Z * [new branch] gh/jamesjwu/208/orig -> origin/gh/jamesjwu/208/orig 2025-12-04T12:53:07.9509045Z * [new branch] gh/jamesjwu/52/base -> origin/gh/jamesjwu/52/base 2025-12-04T12:53:07.9509117Z * [new branch] gh/jamesjwu/52/head -> origin/gh/jamesjwu/52/head 2025-12-04T12:53:07.9509187Z * [new branch] gh/jamesjwu/53/base -> origin/gh/jamesjwu/53/base 2025-12-04T12:53:07.9509256Z * [new branch] gh/jamesjwu/53/head -> origin/gh/jamesjwu/53/head 2025-12-04T12:53:07.9509328Z * [new branch] gh/jamesjwu/54/base -> origin/gh/jamesjwu/54/base 2025-12-04T12:53:07.9509396Z * [new branch] gh/jamesjwu/54/head -> origin/gh/jamesjwu/54/head 2025-12-04T12:53:07.9509466Z * [new branch] gh/jamesjwu/55/base -> origin/gh/jamesjwu/55/base 2025-12-04T12:53:07.9509565Z * [new branch] gh/jamesjwu/55/head -> origin/gh/jamesjwu/55/head 2025-12-04T12:53:07.9509634Z * [new branch] gh/jamesjwu/56/base -> origin/gh/jamesjwu/56/base 2025-12-04T12:53:07.9509702Z * [new branch] gh/jamesjwu/56/head -> origin/gh/jamesjwu/56/head 2025-12-04T12:53:07.9509773Z * [new branch] gh/jamesjwu/57/base -> origin/gh/jamesjwu/57/base 2025-12-04T12:53:07.9509841Z * [new branch] gh/jamesjwu/57/head -> origin/gh/jamesjwu/57/head 2025-12-04T12:53:07.9509909Z * [new branch] gh/jamesjwu/58/base -> origin/gh/jamesjwu/58/base 2025-12-04T12:53:07.9509980Z * [new branch] gh/jamesjwu/58/head -> origin/gh/jamesjwu/58/head 2025-12-04T12:53:07.9510048Z * [new branch] gh/jamesjwu/59/base -> origin/gh/jamesjwu/59/base 2025-12-04T12:53:07.9510116Z * [new branch] gh/jamesjwu/59/head -> origin/gh/jamesjwu/59/head 2025-12-04T12:53:07.9510238Z * [new branch] gh/jamesjwu/60/base -> origin/gh/jamesjwu/60/base 2025-12-04T12:53:07.9510308Z * [new branch] gh/jamesjwu/60/head -> origin/gh/jamesjwu/60/head 2025-12-04T12:53:07.9510377Z * [new branch] gh/jamesjwu/61/base -> origin/gh/jamesjwu/61/base 2025-12-04T12:53:07.9510447Z * [new branch] gh/jamesjwu/61/head -> origin/gh/jamesjwu/61/head 2025-12-04T12:53:07.9510516Z * [new branch] gh/jamesjwu/62/base -> origin/gh/jamesjwu/62/base 2025-12-04T12:53:07.9510584Z * [new branch] gh/jamesjwu/62/head -> origin/gh/jamesjwu/62/head 2025-12-04T12:53:07.9510654Z * [new branch] gh/jamesjwu/63/base -> origin/gh/jamesjwu/63/base 2025-12-04T12:53:07.9510722Z * [new branch] gh/jamesjwu/63/head -> origin/gh/jamesjwu/63/head 2025-12-04T12:53:07.9510791Z * [new branch] gh/jamesjwu/64/base -> origin/gh/jamesjwu/64/base 
2025-12-04T12:53:07.9510861Z * [new branch] gh/jamesjwu/64/head -> origin/gh/jamesjwu/64/head 2025-12-04T12:53:07.9510929Z * [new branch] gh/jamesjwu/65/base -> origin/gh/jamesjwu/65/base 2025-12-04T12:53:07.9510999Z * [new branch] gh/jamesjwu/65/head -> origin/gh/jamesjwu/65/head 2025-12-04T12:53:07.9511069Z * [new branch] gh/janeyx99/165/base -> origin/gh/janeyx99/165/base 2025-12-04T12:53:07.9511138Z * [new branch] gh/janeyx99/165/head -> origin/gh/janeyx99/165/head 2025-12-04T12:53:07.9511257Z * [new branch] gh/janeyx99/165/orig -> origin/gh/janeyx99/165/orig 2025-12-04T12:53:07.9511326Z * [new branch] gh/janeyx99/201/base -> origin/gh/janeyx99/201/base 2025-12-04T12:53:07.9511395Z * [new branch] gh/janeyx99/201/head -> origin/gh/janeyx99/201/head 2025-12-04T12:53:07.9511465Z * [new branch] gh/janeyx99/201/orig -> origin/gh/janeyx99/201/orig 2025-12-04T12:53:07.9511535Z * [new branch] gh/janeyx99/225/base -> origin/gh/janeyx99/225/base 2025-12-04T12:53:07.9511604Z * [new branch] gh/janeyx99/225/head -> origin/gh/janeyx99/225/head 2025-12-04T12:53:07.9511673Z * [new branch] gh/janeyx99/225/orig -> origin/gh/janeyx99/225/orig 2025-12-04T12:53:07.9511743Z * [new branch] gh/janeyx99/299/base -> origin/gh/janeyx99/299/base 2025-12-04T12:53:07.9511812Z * [new branch] gh/janeyx99/299/head -> origin/gh/janeyx99/299/head 2025-12-04T12:53:07.9511883Z * [new branch] gh/janeyx99/299/orig -> origin/gh/janeyx99/299/orig 2025-12-04T12:53:07.9511951Z * [new branch] gh/janeyx99/302/base -> origin/gh/janeyx99/302/base 2025-12-04T12:53:07.9512019Z * [new branch] gh/janeyx99/302/head -> origin/gh/janeyx99/302/head 2025-12-04T12:53:07.9512089Z * [new branch] gh/janeyx99/303/base -> origin/gh/janeyx99/303/base 2025-12-04T12:53:07.9512199Z * [new branch] gh/janeyx99/303/head -> origin/gh/janeyx99/303/head 2025-12-04T12:53:07.9512269Z * [new branch] gh/janeyx99/305/base -> origin/gh/janeyx99/305/base 2025-12-04T12:53:07.9512339Z * [new branch] gh/janeyx99/305/head -> origin/gh/janeyx99/305/head 2025-12-04T12:53:07.9512407Z * [new branch] gh/janeyx99/306/base -> origin/gh/janeyx99/306/base 2025-12-04T12:53:07.9512477Z * [new branch] gh/janeyx99/306/head -> origin/gh/janeyx99/306/head 2025-12-04T12:53:07.9512547Z * [new branch] gh/janeyx99/314/base -> origin/gh/janeyx99/314/base 2025-12-04T12:53:07.9512616Z * [new branch] gh/janeyx99/314/head -> origin/gh/janeyx99/314/head 2025-12-04T12:53:07.9512686Z * [new branch] gh/janeyx99/314/orig -> origin/gh/janeyx99/314/orig 2025-12-04T12:53:07.9512755Z * [new branch] gh/janeyx99/315/base -> origin/gh/janeyx99/315/base 2025-12-04T12:53:07.9512826Z * [new branch] gh/janeyx99/315/head -> origin/gh/janeyx99/315/head 2025-12-04T12:53:07.9512896Z * [new branch] gh/janeyx99/315/orig -> origin/gh/janeyx99/315/orig 2025-12-04T12:53:07.9512965Z * [new branch] gh/janeyx99/316/base -> origin/gh/janeyx99/316/base 2025-12-04T12:53:07.9513033Z * [new branch] gh/janeyx99/316/head -> origin/gh/janeyx99/316/head 2025-12-04T12:53:07.9513104Z * [new branch] gh/janeyx99/316/orig -> origin/gh/janeyx99/316/orig 2025-12-04T12:53:07.9513175Z * [new branch] gh/janeyx99/317/base -> origin/gh/janeyx99/317/base 2025-12-04T12:53:07.9513244Z * [new branch] gh/janeyx99/317/head -> origin/gh/janeyx99/317/head 2025-12-04T12:53:07.9513314Z * [new branch] gh/janeyx99/317/orig -> origin/gh/janeyx99/317/orig 2025-12-04T12:53:07.9513383Z * [new branch] gh/janeyx99/325/base -> origin/gh/janeyx99/325/base 2025-12-04T12:53:07.9513453Z * [new branch] gh/janeyx99/325/head -> origin/gh/janeyx99/325/head 
2025-12-04T12:53:07.9513522Z * [new branch] gh/janeyx99/325/orig -> origin/gh/janeyx99/325/orig 2025-12-04T12:53:07.9513591Z * [new branch] gh/janeyx99/327/base -> origin/gh/janeyx99/327/base 2025-12-04T12:53:07.9513660Z * [new branch] gh/janeyx99/327/head -> origin/gh/janeyx99/327/head 2025-12-04T12:53:07.9513731Z * [new branch] gh/janeyx99/327/orig -> origin/gh/janeyx99/327/orig 2025-12-04T12:53:07.9513837Z * [new branch] gh/janeyx99/328/base -> origin/gh/janeyx99/328/base 2025-12-04T12:53:07.9513907Z * [new branch] gh/janeyx99/328/head -> origin/gh/janeyx99/328/head 2025-12-04T12:53:07.9513975Z * [new branch] gh/janeyx99/328/orig -> origin/gh/janeyx99/328/orig 2025-12-04T12:53:07.9514045Z * [new branch] gh/janeyx99/329/base -> origin/gh/janeyx99/329/base 2025-12-04T12:53:07.9514116Z * [new branch] gh/janeyx99/329/head -> origin/gh/janeyx99/329/head 2025-12-04T12:53:07.9514184Z * [new branch] gh/janeyx99/329/orig -> origin/gh/janeyx99/329/orig 2025-12-04T12:53:07.9514253Z * [new branch] gh/janeyx99/330/base -> origin/gh/janeyx99/330/base 2025-12-04T12:53:07.9514323Z * [new branch] gh/janeyx99/330/head -> origin/gh/janeyx99/330/head 2025-12-04T12:53:07.9514391Z * [new branch] gh/janeyx99/330/orig -> origin/gh/janeyx99/330/orig 2025-12-04T12:53:07.9514461Z * [new branch] gh/janeyx99/331/base -> origin/gh/janeyx99/331/base 2025-12-04T12:53:07.9514531Z * [new branch] gh/janeyx99/331/head -> origin/gh/janeyx99/331/head 2025-12-04T12:53:07.9514599Z * [new branch] gh/janeyx99/331/orig -> origin/gh/janeyx99/331/orig 2025-12-04T12:53:07.9514698Z * [new branch] gh/janeyx99/332/base -> origin/gh/janeyx99/332/base 2025-12-04T12:53:07.9514769Z * [new branch] gh/janeyx99/332/head -> origin/gh/janeyx99/332/head 2025-12-04T12:53:07.9514838Z * [new branch] gh/janeyx99/332/orig -> origin/gh/janeyx99/332/orig 2025-12-04T12:53:07.9514906Z * [new branch] gh/janeyx99/333/base -> origin/gh/janeyx99/333/base 2025-12-04T12:53:07.9514976Z * [new branch] gh/janeyx99/333/head -> origin/gh/janeyx99/333/head 2025-12-04T12:53:07.9515046Z * [new branch] gh/janeyx99/333/orig -> origin/gh/janeyx99/333/orig 2025-12-04T12:53:07.9515116Z * [new branch] gh/janeyx99/88/base -> origin/gh/janeyx99/88/base 2025-12-04T12:53:07.9515187Z * [new branch] gh/janeyx99/88/head -> origin/gh/janeyx99/88/head 2025-12-04T12:53:07.9515255Z * [new branch] gh/janeyx99/88/orig -> origin/gh/janeyx99/88/orig 2025-12-04T12:53:07.9515326Z * [new branch] gh/jansel/360/base -> origin/gh/jansel/360/base 2025-12-04T12:53:07.9515394Z * [new branch] gh/jansel/360/head -> origin/gh/jansel/360/head 2025-12-04T12:53:07.9515463Z * [new branch] gh/jansel/451/base -> origin/gh/jansel/451/base 2025-12-04T12:53:07.9515532Z * [new branch] gh/jansel/451/head -> origin/gh/jansel/451/head 2025-12-04T12:53:07.9515599Z * [new branch] gh/jansel/451/orig -> origin/gh/jansel/451/orig 2025-12-04T12:53:07.9515666Z * [new branch] gh/jansel/462/base -> origin/gh/jansel/462/base 2025-12-04T12:53:07.9515735Z * [new branch] gh/jansel/462/head -> origin/gh/jansel/462/head 2025-12-04T12:53:07.9515802Z * [new branch] gh/jansel/462/orig -> origin/gh/jansel/462/orig 2025-12-04T12:53:07.9515867Z * [new branch] gh/jansel/533/base -> origin/gh/jansel/533/base 2025-12-04T12:53:07.9515936Z * [new branch] gh/jansel/533/head -> origin/gh/jansel/533/head 2025-12-04T12:53:07.9516003Z * [new branch] gh/jansel/533/orig -> origin/gh/jansel/533/orig 2025-12-04T12:53:07.9516069Z * [new branch] gh/jansel/552/base -> origin/gh/jansel/552/base 2025-12-04T12:53:07.9516137Z * [new branch] 
gh/jansel/552/head -> origin/gh/jansel/552/head 2025-12-04T12:53:07.9516204Z * [new branch] gh/jansel/552/orig -> origin/gh/jansel/552/orig 2025-12-04T12:53:07.9516269Z * [new branch] gh/jansel/553/base -> origin/gh/jansel/553/base 2025-12-04T12:53:07.9516366Z * [new branch] gh/jansel/553/head -> origin/gh/jansel/553/head 2025-12-04T12:53:07.9516432Z * [new branch] gh/jansel/553/orig -> origin/gh/jansel/553/orig 2025-12-04T12:53:07.9516499Z * [new branch] gh/jansel/554/base -> origin/gh/jansel/554/base 2025-12-04T12:53:07.9516567Z * [new branch] gh/jansel/554/head -> origin/gh/jansel/554/head 2025-12-04T12:53:07.9516633Z * [new branch] gh/jansel/554/orig -> origin/gh/jansel/554/orig 2025-12-04T12:53:07.9516700Z * [new branch] gh/jansel/555/base -> origin/gh/jansel/555/base 2025-12-04T12:53:07.9516769Z * [new branch] gh/jansel/555/head -> origin/gh/jansel/555/head 2025-12-04T12:53:07.9516836Z * [new branch] gh/jansel/555/orig -> origin/gh/jansel/555/orig 2025-12-04T12:53:07.9516902Z * [new branch] gh/jansel/556/base -> origin/gh/jansel/556/base 2025-12-04T12:53:07.9516971Z * [new branch] gh/jansel/556/head -> origin/gh/jansel/556/head 2025-12-04T12:53:07.9517038Z * [new branch] gh/jansel/556/orig -> origin/gh/jansel/556/orig 2025-12-04T12:53:07.9517105Z * [new branch] gh/jansel/557/base -> origin/gh/jansel/557/base 2025-12-04T12:53:07.9517172Z * [new branch] gh/jansel/557/head -> origin/gh/jansel/557/head 2025-12-04T12:53:07.9517265Z * [new branch] gh/jansel/557/orig -> origin/gh/jansel/557/orig 2025-12-04T12:53:07.9517334Z * [new branch] gh/jansel/558/base -> origin/gh/jansel/558/base 2025-12-04T12:53:07.9517401Z * [new branch] gh/jansel/558/head -> origin/gh/jansel/558/head 2025-12-04T12:53:07.9517467Z * [new branch] gh/jansel/558/orig -> origin/gh/jansel/558/orig 2025-12-04T12:53:07.9517534Z * [new branch] gh/jansel/559/base -> origin/gh/jansel/559/base 2025-12-04T12:53:07.9517602Z * [new branch] gh/jansel/559/head -> origin/gh/jansel/559/head 2025-12-04T12:53:07.9517669Z * [new branch] gh/jansel/559/orig -> origin/gh/jansel/559/orig 2025-12-04T12:53:07.9517736Z * [new branch] gh/jansel/560/base -> origin/gh/jansel/560/base 2025-12-04T12:53:07.9517803Z * [new branch] gh/jansel/560/head -> origin/gh/jansel/560/head 2025-12-04T12:53:07.9517871Z * [new branch] gh/jansel/560/orig -> origin/gh/jansel/560/orig 2025-12-04T12:53:07.9517938Z * [new branch] gh/jansel/561/base -> origin/gh/jansel/561/base 2025-12-04T12:53:07.9518005Z * [new branch] gh/jansel/561/head -> origin/gh/jansel/561/head 2025-12-04T12:53:07.9518071Z * [new branch] gh/jansel/561/orig -> origin/gh/jansel/561/orig 2025-12-04T12:53:07.9518138Z * [new branch] gh/jansel/562/base -> origin/gh/jansel/562/base 2025-12-04T12:53:07.9518207Z * [new branch] gh/jansel/562/head -> origin/gh/jansel/562/head 2025-12-04T12:53:07.9518273Z * [new branch] gh/jansel/562/orig -> origin/gh/jansel/562/orig 2025-12-04T12:53:07.9518339Z * [new branch] gh/jansel/563/base -> origin/gh/jansel/563/base 2025-12-04T12:53:07.9518405Z * [new branch] gh/jansel/563/head -> origin/gh/jansel/563/head 2025-12-04T12:53:07.9518473Z * [new branch] gh/jansel/563/orig -> origin/gh/jansel/563/orig 2025-12-04T12:53:07.9518541Z * [new branch] gh/jansel/564/base -> origin/gh/jansel/564/base 2025-12-04T12:53:07.9518606Z * [new branch] gh/jansel/564/head -> origin/gh/jansel/564/head 2025-12-04T12:53:07.9518673Z * [new branch] gh/jansel/564/orig -> origin/gh/jansel/564/orig 2025-12-04T12:53:07.9518739Z * [new branch] gh/jansel/565/base -> origin/gh/jansel/565/base 
2025-12-04T12:53:07.9518837Z * [new branch] gh/jansel/565/head -> origin/gh/jansel/565/head 2025-12-04T12:53:07.9518904Z * [new branch] gh/jansel/565/orig -> origin/gh/jansel/565/orig 2025-12-04T12:53:07.9518971Z * [new branch] gh/jansel/566/base -> origin/gh/jansel/566/base 2025-12-04T12:53:07.9519037Z * [new branch] gh/jansel/566/head -> origin/gh/jansel/566/head 2025-12-04T12:53:07.9519106Z * [new branch] gh/jansel/566/orig -> origin/gh/jansel/566/orig 2025-12-04T12:53:07.9519172Z * [new branch] gh/jansel/567/base -> origin/gh/jansel/567/base 2025-12-04T12:53:07.9519238Z * [new branch] gh/jansel/567/head -> origin/gh/jansel/567/head 2025-12-04T12:53:07.9519306Z * [new branch] gh/jansel/567/orig -> origin/gh/jansel/567/orig 2025-12-04T12:53:07.9519373Z * [new branch] gh/jansel/568/base -> origin/gh/jansel/568/base 2025-12-04T12:53:07.9519441Z * [new branch] gh/jansel/568/head -> origin/gh/jansel/568/head 2025-12-04T12:53:07.9519509Z * [new branch] gh/jansel/568/orig -> origin/gh/jansel/568/orig 2025-12-04T12:53:07.9519575Z * [new branch] gh/jansel/569/base -> origin/gh/jansel/569/base 2025-12-04T12:53:07.9519642Z * [new branch] gh/jansel/569/head -> origin/gh/jansel/569/head 2025-12-04T12:53:07.9519742Z * [new branch] gh/jansel/569/orig -> origin/gh/jansel/569/orig 2025-12-04T12:53:07.9519809Z * [new branch] gh/jansel/570/base -> origin/gh/jansel/570/base 2025-12-04T12:53:07.9519875Z * [new branch] gh/jansel/570/head -> origin/gh/jansel/570/head 2025-12-04T12:53:07.9519943Z * [new branch] gh/jansel/570/orig -> origin/gh/jansel/570/orig 2025-12-04T12:53:07.9520009Z * [new branch] gh/jansel/571/base -> origin/gh/jansel/571/base 2025-12-04T12:53:07.9520077Z * [new branch] gh/jansel/571/head -> origin/gh/jansel/571/head 2025-12-04T12:53:07.9520144Z * [new branch] gh/jansel/571/orig -> origin/gh/jansel/571/orig 2025-12-04T12:53:07.9520248Z * [new branch] gh/jansel/572/base -> origin/gh/jansel/572/base 2025-12-04T12:53:07.9520317Z * [new branch] gh/jansel/572/head -> origin/gh/jansel/572/head 2025-12-04T12:53:07.9520384Z * [new branch] gh/jansel/572/orig -> origin/gh/jansel/572/orig 2025-12-04T12:53:07.9520450Z * [new branch] gh/jansel/573/base -> origin/gh/jansel/573/base 2025-12-04T12:53:07.9520518Z * [new branch] gh/jansel/573/head -> origin/gh/jansel/573/head 2025-12-04T12:53:07.9520584Z * [new branch] gh/jansel/573/orig -> origin/gh/jansel/573/orig 2025-12-04T12:53:07.9520650Z * [new branch] gh/jansel/574/base -> origin/gh/jansel/574/base 2025-12-04T12:53:07.9520719Z * [new branch] gh/jansel/574/head -> origin/gh/jansel/574/head 2025-12-04T12:53:07.9520785Z * [new branch] gh/jansel/574/orig -> origin/gh/jansel/574/orig 2025-12-04T12:53:07.9520851Z * [new branch] gh/jansel/575/base -> origin/gh/jansel/575/base 2025-12-04T12:53:07.9520918Z * [new branch] gh/jansel/575/head -> origin/gh/jansel/575/head 2025-12-04T12:53:07.9520985Z * [new branch] gh/jansel/575/orig -> origin/gh/jansel/575/orig 2025-12-04T12:53:07.9521052Z * [new branch] gh/jansel/576/base -> origin/gh/jansel/576/base 2025-12-04T12:53:07.9521119Z * [new branch] gh/jansel/576/head -> origin/gh/jansel/576/head 2025-12-04T12:53:07.9521185Z * [new branch] gh/jansel/576/orig -> origin/gh/jansel/576/orig 2025-12-04T12:53:07.9521266Z * [new branch] gh/jbschlosser/247/base -> origin/gh/jbschlosser/247/base 2025-12-04T12:53:07.9521387Z * [new branch] gh/jbschlosser/247/head -> origin/gh/jbschlosser/247/head 2025-12-04T12:53:07.9521462Z * [new branch] gh/jbschlosser/247/orig -> origin/gh/jbschlosser/247/orig 2025-12-04T12:53:07.9521536Z 
* [new branch] gh/jbschlosser/250/base -> origin/gh/jbschlosser/250/base 2025-12-04T12:53:07.9521611Z * [new branch] gh/jbschlosser/250/head -> origin/gh/jbschlosser/250/head 2025-12-04T12:53:07.9521686Z * [new branch] gh/jbschlosser/250/orig -> origin/gh/jbschlosser/250/orig 2025-12-04T12:53:07.9521759Z * [new branch] gh/jerryzh168/1/base -> origin/gh/jerryzh168/1/base 2025-12-04T12:53:07.9521830Z * [new branch] gh/jerryzh168/1/head -> origin/gh/jerryzh168/1/head 2025-12-04T12:53:07.9521901Z * [new branch] gh/jerryzh168/1/orig -> origin/gh/jerryzh168/1/orig 2025-12-04T12:53:07.9521974Z * [new branch] gh/jiayisunx/59/base -> origin/gh/jiayisunx/59/base 2025-12-04T12:53:07.9522045Z * [new branch] gh/jiayisunx/59/head -> origin/gh/jiayisunx/59/head 2025-12-04T12:53:07.9522115Z * [new branch] gh/jiayisunx/59/orig -> origin/gh/jiayisunx/59/orig 2025-12-04T12:53:07.9522186Z * [new branch] gh/jiayisunx/61/base -> origin/gh/jiayisunx/61/base 2025-12-04T12:53:07.9522256Z * [new branch] gh/jiayisunx/61/head -> origin/gh/jiayisunx/61/head 2025-12-04T12:53:07.9522374Z * [new branch] gh/jiayisunx/61/orig -> origin/gh/jiayisunx/61/orig 2025-12-04T12:53:07.9522444Z * [new branch] gh/jiayisunx/68/base -> origin/gh/jiayisunx/68/base 2025-12-04T12:53:07.9522514Z * [new branch] gh/jiayisunx/68/head -> origin/gh/jiayisunx/68/head 2025-12-04T12:53:07.9522584Z * [new branch] gh/jiayisunx/68/orig -> origin/gh/jiayisunx/68/orig 2025-12-04T12:53:07.9522655Z * [new branch] gh/jiayisunx/77/base -> origin/gh/jiayisunx/77/base 2025-12-04T12:53:07.9522726Z * [new branch] gh/jiayisunx/77/head -> origin/gh/jiayisunx/77/head 2025-12-04T12:53:07.9522796Z * [new branch] gh/jiayisunx/77/orig -> origin/gh/jiayisunx/77/orig 2025-12-04T12:53:07.9522868Z * [new branch] gh/jiayisunx/78/base -> origin/gh/jiayisunx/78/base 2025-12-04T12:53:07.9522939Z * [new branch] gh/jiayisunx/78/head -> origin/gh/jiayisunx/78/head 2025-12-04T12:53:07.9523008Z * [new branch] gh/jiayisunx/78/orig -> origin/gh/jiayisunx/78/orig 2025-12-04T12:53:07.9523080Z * [new branch] gh/jiayisunx/79/base -> origin/gh/jiayisunx/79/base 2025-12-04T12:53:07.9523150Z * [new branch] gh/jiayisunx/79/head -> origin/gh/jiayisunx/79/head 2025-12-04T12:53:07.9523220Z * [new branch] gh/jiayisunx/79/orig -> origin/gh/jiayisunx/79/orig 2025-12-04T12:53:07.9523291Z * [new branch] gh/jiayisunx/82/base -> origin/gh/jiayisunx/82/base 2025-12-04T12:53:07.9523363Z * [new branch] gh/jiayisunx/82/head -> origin/gh/jiayisunx/82/head 2025-12-04T12:53:07.9523435Z * [new branch] gh/jiayisunx/82/orig -> origin/gh/jiayisunx/82/orig 2025-12-04T12:53:07.9523505Z * [new branch] gh/jiayisunx/83/base -> origin/gh/jiayisunx/83/base 2025-12-04T12:53:07.9523576Z * [new branch] gh/jiayisunx/83/head -> origin/gh/jiayisunx/83/head 2025-12-04T12:53:07.9523649Z * [new branch] gh/jiayisunx/83/orig -> origin/gh/jiayisunx/83/orig 2025-12-04T12:53:07.9523720Z * [new branch] gh/jiayisunx/84/base -> origin/gh/jiayisunx/84/base 2025-12-04T12:53:07.9523791Z * [new branch] gh/jiayisunx/84/head -> origin/gh/jiayisunx/84/head 2025-12-04T12:53:07.9523863Z * [new branch] gh/jiayisunx/84/orig -> origin/gh/jiayisunx/84/orig 2025-12-04T12:53:07.9523934Z * [new branch] gh/jiayisunx/85/base -> origin/gh/jiayisunx/85/base 2025-12-04T12:53:07.9524033Z * [new branch] gh/jiayisunx/85/head -> origin/gh/jiayisunx/85/head 2025-12-04T12:53:07.9524107Z * [new branch] gh/jiayisunx/85/orig -> origin/gh/jiayisunx/85/orig 2025-12-04T12:53:07.9524177Z * [new branch] gh/jiayisunx/86/base -> origin/gh/jiayisunx/86/base 
2025-12-04T12:53:07.9524248Z * [new branch] gh/jiayisunx/86/head -> origin/gh/jiayisunx/86/head 2025-12-04T12:53:07.9524320Z * [new branch] gh/jiayisunx/86/orig -> origin/gh/jiayisunx/86/orig 2025-12-04T12:53:07.9524389Z * [new branch] gh/jiayisunx/87/base -> origin/gh/jiayisunx/87/base 2025-12-04T12:53:07.9524459Z * [new branch] gh/jiayisunx/87/head -> origin/gh/jiayisunx/87/head 2025-12-04T12:53:07.9524531Z * [new branch] gh/jiayisunx/87/orig -> origin/gh/jiayisunx/87/orig 2025-12-04T12:53:07.9524601Z * [new branch] gh/jiayisunx/88/base -> origin/gh/jiayisunx/88/base 2025-12-04T12:53:07.9524672Z * [new branch] gh/jiayisunx/88/head -> origin/gh/jiayisunx/88/head 2025-12-04T12:53:07.9524744Z * [new branch] gh/jiayisunx/88/orig -> origin/gh/jiayisunx/88/orig 2025-12-04T12:53:07.9524813Z * [new branch] gh/jiayisunx/89/base -> origin/gh/jiayisunx/89/base 2025-12-04T12:53:07.9524917Z * [new branch] gh/jiayisunx/89/head -> origin/gh/jiayisunx/89/head 2025-12-04T12:53:07.9524989Z * [new branch] gh/jiayisunx/89/orig -> origin/gh/jiayisunx/89/orig 2025-12-04T12:53:07.9525060Z * [new branch] gh/jiayisunx/90/base -> origin/gh/jiayisunx/90/base 2025-12-04T12:53:07.9525131Z * [new branch] gh/jiayisunx/90/head -> origin/gh/jiayisunx/90/head 2025-12-04T12:53:07.9525201Z * [new branch] gh/jiayisunx/90/orig -> origin/gh/jiayisunx/90/orig 2025-12-04T12:53:07.9525277Z * [new branch] gh/jjwu@meta.com/1/base -> origin/gh/jjwu@meta.com/1/base 2025-12-04T12:53:07.9525355Z * [new branch] gh/jjwu@meta.com/1/head -> origin/gh/jjwu@meta.com/1/head 2025-12-04T12:53:07.9525424Z * [new branch] gh/jturney/1/base -> origin/gh/jturney/1/base 2025-12-04T12:53:07.9525492Z * [new branch] gh/jturney/1/head -> origin/gh/jturney/1/head 2025-12-04T12:53:07.9525563Z * [new branch] gh/jturney/1/orig -> origin/gh/jturney/1/orig 2025-12-04T12:53:07.9525630Z * [new branch] gh/jturney/2/base -> origin/gh/jturney/2/base 2025-12-04T12:53:07.9525696Z * [new branch] gh/jturney/2/head -> origin/gh/jturney/2/head 2025-12-04T12:53:07.9525763Z * [new branch] gh/jturney/2/orig -> origin/gh/jturney/2/orig 2025-12-04T12:53:07.9525839Z * [new branch] gh/karthickai/10/base -> origin/gh/karthickai/10/base 2025-12-04T12:53:07.9525914Z * [new branch] gh/karthickai/10/head -> origin/gh/karthickai/10/head 2025-12-04T12:53:07.9525990Z * [new branch] gh/karthickai/10/orig -> origin/gh/karthickai/10/orig 2025-12-04T12:53:07.9526062Z * [new branch] gh/karthickai/11/base -> origin/gh/karthickai/11/base 2025-12-04T12:53:07.9526134Z * [new branch] gh/karthickai/11/head -> origin/gh/karthickai/11/head 2025-12-04T12:53:07.9526207Z * [new branch] gh/karthickai/11/orig -> origin/gh/karthickai/11/orig 2025-12-04T12:53:07.9526279Z * [new branch] gh/karthickai/12/base -> origin/gh/karthickai/12/base 2025-12-04T12:53:07.9526351Z * [new branch] gh/karthickai/12/head -> origin/gh/karthickai/12/head 2025-12-04T12:53:07.9526425Z * [new branch] gh/karthickai/12/orig -> origin/gh/karthickai/12/orig 2025-12-04T12:53:07.9526497Z * [new branch] gh/karthickai/13/base -> origin/gh/karthickai/13/base 2025-12-04T12:53:07.9526570Z * [new branch] gh/karthickai/13/head -> origin/gh/karthickai/13/head 2025-12-04T12:53:07.9526671Z * [new branch] gh/karthickai/13/orig -> origin/gh/karthickai/13/orig 2025-12-04T12:53:07.9526743Z * [new branch] gh/karthickai/14/base -> origin/gh/karthickai/14/base 2025-12-04T12:53:07.9526816Z * [new branch] gh/karthickai/14/head -> origin/gh/karthickai/14/head 2025-12-04T12:53:07.9526888Z * [new branch] gh/karthickai/14/orig -> 
origin/gh/karthickai/14/orig 2025-12-04T12:53:07.9526960Z * [new branch] gh/karthickai/15/base -> origin/gh/karthickai/15/base 2025-12-04T12:53:07.9527032Z * [new branch] gh/karthickai/15/head -> origin/gh/karthickai/15/head 2025-12-04T12:53:07.9527103Z * [new branch] gh/karthickai/15/orig -> origin/gh/karthickai/15/orig 2025-12-04T12:53:07.9527174Z * [new branch] gh/karthickai/16/base -> origin/gh/karthickai/16/base 2025-12-04T12:53:07.9527249Z * [new branch] gh/karthickai/16/head -> origin/gh/karthickai/16/head 2025-12-04T12:53:07.9527320Z * [new branch] gh/karthickai/16/orig -> origin/gh/karthickai/16/orig 2025-12-04T12:53:07.9527391Z * [new branch] gh/karthickai/17/base -> origin/gh/karthickai/17/base 2025-12-04T12:53:07.9527463Z * [new branch] gh/karthickai/17/head -> origin/gh/karthickai/17/head 2025-12-04T12:53:07.9527671Z * [new branch] gh/karthickai/17/orig -> origin/gh/karthickai/17/orig 2025-12-04T12:53:07.9527743Z * [new branch] gh/karthickai/18/base -> origin/gh/karthickai/18/base 2025-12-04T12:53:07.9527816Z * [new branch] gh/karthickai/18/head -> origin/gh/karthickai/18/head 2025-12-04T12:53:07.9527888Z * [new branch] gh/karthickai/18/orig -> origin/gh/karthickai/18/orig 2025-12-04T12:53:07.9527959Z * [new branch] gh/karthickai/19/base -> origin/gh/karthickai/19/base 2025-12-04T12:53:07.9528034Z * [new branch] gh/karthickai/19/head -> origin/gh/karthickai/19/head 2025-12-04T12:53:07.9528105Z * [new branch] gh/karthickai/19/orig -> origin/gh/karthickai/19/orig 2025-12-04T12:53:07.9528177Z * [new branch] gh/karthickai/20/base -> origin/gh/karthickai/20/base 2025-12-04T12:53:07.9528249Z * [new branch] gh/karthickai/20/head -> origin/gh/karthickai/20/head 2025-12-04T12:53:07.9528322Z * [new branch] gh/karthickai/20/orig -> origin/gh/karthickai/20/orig 2025-12-04T12:53:07.9528394Z * [new branch] gh/karthickai/21/base -> origin/gh/karthickai/21/base 2025-12-04T12:53:07.9528466Z * [new branch] gh/karthickai/21/head -> origin/gh/karthickai/21/head 2025-12-04T12:53:07.9528537Z * [new branch] gh/karthickai/21/orig -> origin/gh/karthickai/21/orig 2025-12-04T12:53:07.9528609Z * [new branch] gh/karthickai/22/base -> origin/gh/karthickai/22/base 2025-12-04T12:53:07.9528682Z * [new branch] gh/karthickai/22/head -> origin/gh/karthickai/22/head 2025-12-04T12:53:07.9528754Z * [new branch] gh/karthickai/22/orig -> origin/gh/karthickai/22/orig 2025-12-04T12:53:07.9528827Z * [new branch] gh/karthickai/23/base -> origin/gh/karthickai/23/base 2025-12-04T12:53:07.9528900Z * [new branch] gh/karthickai/23/head -> origin/gh/karthickai/23/head 2025-12-04T12:53:07.9528973Z * [new branch] gh/karthickai/23/orig -> origin/gh/karthickai/23/orig 2025-12-04T12:53:07.9529046Z * [new branch] gh/karthickai/24/base -> origin/gh/karthickai/24/base 2025-12-04T12:53:07.9529118Z * [new branch] gh/karthickai/24/head -> origin/gh/karthickai/24/head 2025-12-04T12:53:07.9529189Z * [new branch] gh/karthickai/24/orig -> origin/gh/karthickai/24/orig 2025-12-04T12:53:07.9529262Z * [new branch] gh/karthickai/25/base -> origin/gh/karthickai/25/base 2025-12-04T12:53:07.9529361Z * [new branch] gh/karthickai/25/head -> origin/gh/karthickai/25/head 2025-12-04T12:53:07.9529433Z * [new branch] gh/karthickai/25/orig -> origin/gh/karthickai/25/orig 2025-12-04T12:53:07.9529505Z * [new branch] gh/karthickai/26/base -> origin/gh/karthickai/26/base 2025-12-04T12:53:07.9529578Z * [new branch] gh/karthickai/26/head -> origin/gh/karthickai/26/head 2025-12-04T12:53:07.9529650Z * [new branch] gh/karthickai/26/orig -> 
origin/gh/karthickai/26/orig 2025-12-04T12:53:07.9529724Z * [new branch] gh/karthickai/6/base -> origin/gh/karthickai/6/base 2025-12-04T12:53:07.9529796Z * [new branch] gh/karthickai/6/head -> origin/gh/karthickai/6/head 2025-12-04T12:53:07.9529868Z * [new branch] gh/karthickai/6/orig -> origin/gh/karthickai/6/orig 2025-12-04T12:53:07.9529935Z * [new branch] gh/krocki/1/base -> origin/gh/krocki/1/base 2025-12-04T12:53:07.9530003Z * [new branch] gh/krocki/1/head -> origin/gh/krocki/1/head 2025-12-04T12:53:07.9530070Z * [new branch] gh/krocki/1/orig -> origin/gh/krocki/1/orig 2025-12-04T12:53:07.9530135Z * [new branch] gh/krocki/2/base -> origin/gh/krocki/2/base 2025-12-04T12:53:07.9530295Z * [new branch] gh/krocki/2/head -> origin/gh/krocki/2/head 2025-12-04T12:53:07.9530363Z * [new branch] gh/krocki/2/orig -> origin/gh/krocki/2/orig 2025-12-04T12:53:07.9530440Z * [new branch] gh/kurtamohler/60/base -> origin/gh/kurtamohler/60/base 2025-12-04T12:53:07.9530516Z * [new branch] gh/kurtamohler/60/head -> origin/gh/kurtamohler/60/head 2025-12-04T12:53:07.9530591Z * [new branch] gh/kurtamohler/60/orig -> origin/gh/kurtamohler/60/orig 2025-12-04T12:53:07.9530665Z * [new branch] gh/kurtamohler/61/base -> origin/gh/kurtamohler/61/base 2025-12-04T12:53:07.9530740Z * [new branch] gh/kurtamohler/61/head -> origin/gh/kurtamohler/61/head 2025-12-04T12:53:07.9530816Z * [new branch] gh/kurtamohler/61/orig -> origin/gh/kurtamohler/61/orig 2025-12-04T12:53:07.9530889Z * [new branch] gh/kurtamohler/62/base -> origin/gh/kurtamohler/62/base 2025-12-04T12:53:07.9530962Z * [new branch] gh/kurtamohler/62/head -> origin/gh/kurtamohler/62/head 2025-12-04T12:53:07.9531036Z * [new branch] gh/kurtamohler/62/orig -> origin/gh/kurtamohler/62/orig 2025-12-04T12:53:07.9531109Z * [new branch] gh/kurtamohler/63/base -> origin/gh/kurtamohler/63/base 2025-12-04T12:53:07.9531182Z * [new branch] gh/kurtamohler/63/head -> origin/gh/kurtamohler/63/head 2025-12-04T12:53:07.9531258Z * [new branch] gh/kurtamohler/63/orig -> origin/gh/kurtamohler/63/orig 2025-12-04T12:53:07.9531331Z * [new branch] gh/kurtamohler/64/base -> origin/gh/kurtamohler/64/base 2025-12-04T12:53:07.9531407Z * [new branch] gh/kurtamohler/64/head -> origin/gh/kurtamohler/64/head 2025-12-04T12:53:07.9531479Z * [new branch] gh/kurtamohler/64/orig -> origin/gh/kurtamohler/64/orig 2025-12-04T12:53:07.9531552Z * [new branch] gh/kurtamohler/65/base -> origin/gh/kurtamohler/65/base 2025-12-04T12:53:07.9531626Z * [new branch] gh/kurtamohler/65/head -> origin/gh/kurtamohler/65/head 2025-12-04T12:53:07.9531699Z * [new branch] gh/kurtamohler/65/orig -> origin/gh/kurtamohler/65/orig 2025-12-04T12:53:07.9531773Z * [new branch] gh/kurtamohler/66/base -> origin/gh/kurtamohler/66/base 2025-12-04T12:53:07.9531850Z * [new branch] gh/kurtamohler/66/head -> origin/gh/kurtamohler/66/head 2025-12-04T12:53:07.9531922Z * [new branch] gh/kurtamohler/66/orig -> origin/gh/kurtamohler/66/orig 2025-12-04T12:53:07.9532039Z * [new branch] gh/kurtamohler/67/base -> origin/gh/kurtamohler/67/base 2025-12-04T12:53:07.9532112Z * [new branch] gh/kurtamohler/67/head -> origin/gh/kurtamohler/67/head 2025-12-04T12:53:07.9532185Z * [new branch] gh/kurtamohler/67/orig -> origin/gh/kurtamohler/67/orig 2025-12-04T12:53:07.9532255Z * [new branch] gh/kwen2501/130/base -> origin/gh/kwen2501/130/base 2025-12-04T12:53:07.9532326Z * [new branch] gh/kwen2501/130/head -> origin/gh/kwen2501/130/head 2025-12-04T12:53:07.9532395Z * [new branch] gh/kwen2501/130/orig -> origin/gh/kwen2501/130/orig 
2025-12-04T12:53:07.9532464Z * [new branch] gh/kwen2501/170/base -> origin/gh/kwen2501/170/base 2025-12-04T12:53:07.9532534Z * [new branch] gh/kwen2501/170/head -> origin/gh/kwen2501/170/head 2025-12-04T12:53:07.9532602Z * [new branch] gh/kwen2501/187/base -> origin/gh/kwen2501/187/base 2025-12-04T12:53:07.9532673Z * [new branch] gh/kwen2501/187/head -> origin/gh/kwen2501/187/head 2025-12-04T12:53:07.9532742Z * [new branch] gh/kwen2501/187/orig -> origin/gh/kwen2501/187/orig 2025-12-04T12:53:07.9532809Z * [new branch] gh/kwen2501/188/base -> origin/gh/kwen2501/188/base 2025-12-04T12:53:07.9532876Z * [new branch] gh/kwen2501/188/head -> origin/gh/kwen2501/188/head 2025-12-04T12:53:07.9532982Z * [new branch] gh/kwen2501/188/orig -> origin/gh/kwen2501/188/orig 2025-12-04T12:53:07.9533051Z * [new branch] gh/kwen2501/211/base -> origin/gh/kwen2501/211/base 2025-12-04T12:53:07.9533120Z * [new branch] gh/kwen2501/211/head -> origin/gh/kwen2501/211/head 2025-12-04T12:53:07.9533188Z * [new branch] gh/kwen2501/224/base -> origin/gh/kwen2501/224/base 2025-12-04T12:53:07.9533257Z * [new branch] gh/kwen2501/224/head -> origin/gh/kwen2501/224/head 2025-12-04T12:53:07.9533327Z * [new branch] gh/kwen2501/224/orig -> origin/gh/kwen2501/224/orig 2025-12-04T12:53:07.9533396Z * [new branch] gh/kwen2501/228/base -> origin/gh/kwen2501/228/base 2025-12-04T12:53:07.9533464Z * [new branch] gh/kwen2501/228/head -> origin/gh/kwen2501/228/head 2025-12-04T12:53:07.9533535Z * [new branch] gh/kwen2501/228/orig -> origin/gh/kwen2501/228/orig 2025-12-04T12:53:07.9533603Z * [new branch] gh/kwen2501/234/base -> origin/gh/kwen2501/234/base 2025-12-04T12:53:07.9533672Z * [new branch] gh/kwen2501/234/head -> origin/gh/kwen2501/234/head 2025-12-04T12:53:07.9533741Z * [new branch] gh/kwen2501/234/orig -> origin/gh/kwen2501/234/orig 2025-12-04T12:53:07.9533809Z * [new branch] gh/kwen2501/235/base -> origin/gh/kwen2501/235/base 2025-12-04T12:53:07.9533876Z * [new branch] gh/kwen2501/235/head -> origin/gh/kwen2501/235/head 2025-12-04T12:53:07.9533948Z * [new branch] gh/kwen2501/235/orig -> origin/gh/kwen2501/235/orig 2025-12-04T12:53:07.9534016Z * [new branch] gh/kwen2501/236/base -> origin/gh/kwen2501/236/base 2025-12-04T12:53:07.9534083Z * [new branch] gh/kwen2501/236/head -> origin/gh/kwen2501/236/head 2025-12-04T12:53:07.9534154Z * [new branch] gh/kwen2501/236/orig -> origin/gh/kwen2501/236/orig 2025-12-04T12:53:07.9534221Z * [new branch] gh/kwen2501/237/base -> origin/gh/kwen2501/237/base 2025-12-04T12:53:07.9534288Z * [new branch] gh/kwen2501/237/head -> origin/gh/kwen2501/237/head 2025-12-04T12:53:07.9534356Z * [new branch] gh/kwen2501/237/orig -> origin/gh/kwen2501/237/orig 2025-12-04T12:53:07.9534423Z * [new branch] gh/kwen2501/238/base -> origin/gh/kwen2501/238/base 2025-12-04T12:53:07.9534492Z * [new branch] gh/kwen2501/238/head -> origin/gh/kwen2501/238/head 2025-12-04T12:53:07.9534587Z * [new branch] gh/kwen2501/238/orig -> origin/gh/kwen2501/238/orig 2025-12-04T12:53:07.9534655Z * [new branch] gh/kwen2501/240/base -> origin/gh/kwen2501/240/base 2025-12-04T12:53:07.9534724Z * [new branch] gh/kwen2501/240/head -> origin/gh/kwen2501/240/head 2025-12-04T12:53:07.9534793Z * [new branch] gh/kwen2501/240/orig -> origin/gh/kwen2501/240/orig 2025-12-04T12:53:07.9534861Z * [new branch] gh/kwen2501/241/base -> origin/gh/kwen2501/241/base 2025-12-04T12:53:07.9534930Z * [new branch] gh/kwen2501/241/head -> origin/gh/kwen2501/241/head 2025-12-04T12:53:07.9534998Z * [new branch] gh/kwen2501/241/orig -> origin/gh/kwen2501/241/orig 
2025-12-04T12:53:07.9535066Z [... several hundred new remote-tracking branches fetched for ghstack refs gh/<user>/<N>/{base,head,orig}, each mapping to the identically named origin/ ref; grouped by author below ...]
2025-12-04T12:53:07.9535066Z * [new branch] gh/kwen2501/{247,252,259,260,268-271,274-288}/{base,head,orig}
2025-12-04T12:53:07.9539914Z * [new branch] gh/laithsakka/{251,276,313,316,317,319-328}/{base,head,orig}, gh/laithsakka/{28,29}/base, gh/laithsakka/{30,31,32}/{base,head}
2025-12-04T12:53:07.9543954Z * [new branch] gh/liangel/4/{base,head,orig}
2025-12-04T12:53:07.9544164Z * [new branch] gh/lucaskabela/1/{base,head}
2025-12-04T12:53:07.9544303Z * [new branch] gh/lw/{4,5,6}/{base,head,orig}
2025-12-04T12:53:07.9544886Z * [new branch] gh/malfet/14/base, gh/malfet/{417,506,528,537,546,565,575,580,581,583,586-612}/{base,head,orig}, gh/malfet/{517,64}/{base,head}
2025-12-04T12:53:07.9552933Z * [new branch] gh/manuelcandales/11/{base,head,orig}
2025-12-04T12:53:07.9553170Z * [new branch] gh/markkm/1/base
2025-12-04T12:53:07.9553265Z * [new branch] gh/masnesral/1/{base,head,orig}
2025-12-04T12:53:07.9553480Z * [new branch] gh/mhorowitz/{0-6}/{base,head}
2025-12-04T12:53:07.9554481Z * [new branch] gh/mikaylagawarecki/{234-238}/{base,head}, gh/mikaylagawarecki/{336,341,342,345-347,350-354,356,357,359-393}/{base,head,orig}
2025-12-04T12:53:07.9568858Z * [new branch] gh/mlazos/{41-44,47-73}/{base,head,orig}
2025-12-04T12:53:07.9575463Z * [new branch] gh/mrmiywj/1/{base,head}
2025-12-04T12:53:07.9575603Z * [new branch] gh/muchulee8/73/{base,head,orig}
2025-12-04T12:53:07.9575832Z * [new branch] gh/naveenthangudu/{1-9}/{base,head,orig}
2025-12-04T12:53:07.9578016Z * [new branch] gh/nikitaved/{1,2,4-6,8-16}/{base,head,orig}
2025-12-04T12:53:07.9581113Z * [new branch] gh/oulgen/{4,7-26}/{base,head,orig}
2025-12-04T12:53:07.9585454Z * [new branch] gh/patvig/mtia-serialization -> origin/gh/patvig/mtia-serialization
2025-12-04T12:53:07.9585522Z * [new branch] gh/pearu/{56,97,108-112,115-119,139,140,142,143,147,149-156}/{base,head,orig}
2025-12-04T12:53:07.9613887Z * [new branch] gh/pianpwk/21/{base,head}, gh/pianpwk/{28-35}/{base,head,orig}
2025-12-04T12:53:07.9615751Z * [new branch] gh/rec/141/{base,head}, gh/rec/{153,154}/{base,head,orig}, gh/rec/164/{base,head}
2025-12-04T12:53:07.9616419Z * [new 
branch] gh/rec/164/orig -> origin/gh/rec/164/orig 2025-12-04T12:53:07.9616484Z * [new branch] gh/rec/166/base -> origin/gh/rec/166/base 2025-12-04T12:53:07.9616546Z * [new branch] gh/rec/166/head -> origin/gh/rec/166/head 2025-12-04T12:53:07.9616610Z * [new branch] gh/rec/166/orig -> origin/gh/rec/166/orig 2025-12-04T12:53:07.9616671Z * [new branch] gh/rec/167/base -> origin/gh/rec/167/base 2025-12-04T12:53:07.9616733Z * [new branch] gh/rec/167/head -> origin/gh/rec/167/head 2025-12-04T12:53:07.9616796Z * [new branch] gh/rec/167/orig -> origin/gh/rec/167/orig 2025-12-04T12:53:07.9616863Z * [new branch] gh/rec/168/base -> origin/gh/rec/168/base 2025-12-04T12:53:07.9616924Z * [new branch] gh/rec/168/head -> origin/gh/rec/168/head 2025-12-04T12:53:07.9616989Z * [new branch] gh/rec/168/orig -> origin/gh/rec/168/orig 2025-12-04T12:53:07.9617050Z * [new branch] gh/rec/169/base -> origin/gh/rec/169/base 2025-12-04T12:53:07.9617139Z * [new branch] gh/rec/169/head -> origin/gh/rec/169/head 2025-12-04T12:53:07.9617205Z * [new branch] gh/rec/169/orig -> origin/gh/rec/169/orig 2025-12-04T12:53:07.9617266Z * [new branch] gh/rec/170/base -> origin/gh/rec/170/base 2025-12-04T12:53:07.9617328Z * [new branch] gh/rec/170/head -> origin/gh/rec/170/head 2025-12-04T12:53:07.9617391Z * [new branch] gh/rec/170/orig -> origin/gh/rec/170/orig 2025-12-04T12:53:07.9617455Z * [new branch] gh/rec/171/base -> origin/gh/rec/171/base 2025-12-04T12:53:07.9617516Z * [new branch] gh/rec/171/head -> origin/gh/rec/171/head 2025-12-04T12:53:07.9617579Z * [new branch] gh/rec/171/orig -> origin/gh/rec/171/orig 2025-12-04T12:53:07.9617640Z * [new branch] gh/rec/172/base -> origin/gh/rec/172/base 2025-12-04T12:53:07.9617703Z * [new branch] gh/rec/172/head -> origin/gh/rec/172/head 2025-12-04T12:53:07.9617766Z * [new branch] gh/rec/172/orig -> origin/gh/rec/172/orig 2025-12-04T12:53:07.9617827Z * [new branch] gh/rec/173/base -> origin/gh/rec/173/base 2025-12-04T12:53:07.9617889Z * [new branch] gh/rec/173/head -> origin/gh/rec/173/head 2025-12-04T12:53:07.9617953Z * [new branch] gh/rec/173/orig -> origin/gh/rec/173/orig 2025-12-04T12:53:07.9618016Z * [new branch] gh/rec/174/base -> origin/gh/rec/174/base 2025-12-04T12:53:07.9618079Z * [new branch] gh/rec/174/head -> origin/gh/rec/174/head 2025-12-04T12:53:07.9618144Z * [new branch] gh/rec/174/orig -> origin/gh/rec/174/orig 2025-12-04T12:53:07.9618210Z * [new branch] gh/rec/175/base -> origin/gh/rec/175/base 2025-12-04T12:53:07.9618278Z * [new branch] gh/rec/175/head -> origin/gh/rec/175/head 2025-12-04T12:53:07.9618493Z * [new branch] gh/rec/175/orig -> origin/gh/rec/175/orig 2025-12-04T12:53:07.9618556Z * [new branch] gh/rec/176/base -> origin/gh/rec/176/base 2025-12-04T12:53:07.9618621Z * [new branch] gh/rec/176/head -> origin/gh/rec/176/head 2025-12-04T12:53:07.9618682Z * [new branch] gh/rec/176/orig -> origin/gh/rec/176/orig 2025-12-04T12:53:07.9618746Z * [new branch] gh/rec/177/base -> origin/gh/rec/177/base 2025-12-04T12:53:07.9618845Z * [new branch] gh/rec/177/head -> origin/gh/rec/177/head 2025-12-04T12:53:07.9618909Z * [new branch] gh/rec/177/orig -> origin/gh/rec/177/orig 2025-12-04T12:53:07.9618970Z * [new branch] gh/rec/178/base -> origin/gh/rec/178/base 2025-12-04T12:53:07.9619035Z * [new branch] gh/rec/178/head -> origin/gh/rec/178/head 2025-12-04T12:53:07.9619100Z * [new branch] gh/rec/178/orig -> origin/gh/rec/178/orig 2025-12-04T12:53:07.9619195Z * [new branch] gh/robert-hardwick/3/base -> origin/gh/robert-hardwick/3/base 2025-12-04T12:53:07.9619281Z * [new branch] 
gh/robert-hardwick/3/head -> origin/gh/robert-hardwick/3/head 2025-12-04T12:53:07.9619362Z * [new branch] gh/robert-hardwick/3/orig -> origin/gh/robert-hardwick/3/orig 2025-12-04T12:53:07.9619443Z * [new branch] gh/robert-hardwick/4/base -> origin/gh/robert-hardwick/4/base 2025-12-04T12:53:07.9619526Z * [new branch] gh/robert-hardwick/4/head -> origin/gh/robert-hardwick/4/head 2025-12-04T12:53:07.9619607Z * [new branch] gh/robert-hardwick/4/orig -> origin/gh/robert-hardwick/4/orig 2025-12-04T12:53:07.9619689Z * [new branch] gh/robert-hardwick/5/base -> origin/gh/robert-hardwick/5/base 2025-12-04T12:53:07.9619798Z * [new branch] gh/robert-hardwick/5/head -> origin/gh/robert-hardwick/5/head 2025-12-04T12:53:07.9619878Z * [new branch] gh/robert-hardwick/5/orig -> origin/gh/robert-hardwick/5/orig 2025-12-04T12:53:07.9619958Z * [new branch] gh/robert-hardwick/6/base -> origin/gh/robert-hardwick/6/base 2025-12-04T12:53:07.9620040Z * [new branch] gh/robert-hardwick/6/head -> origin/gh/robert-hardwick/6/head 2025-12-04T12:53:07.9620119Z * [new branch] gh/robert-hardwick/6/orig -> origin/gh/robert-hardwick/6/orig 2025-12-04T12:53:07.9620259Z * [new branch] gh/robert-hardwick/7/base -> origin/gh/robert-hardwick/7/base 2025-12-04T12:53:07.9620343Z * [new branch] gh/robert-hardwick/7/head -> origin/gh/robert-hardwick/7/head 2025-12-04T12:53:07.9620423Z * [new branch] gh/robert-hardwick/7/orig -> origin/gh/robert-hardwick/7/orig 2025-12-04T12:53:07.9620505Z * [new branch] gh/robert-hardwick/8/base -> origin/gh/robert-hardwick/8/base 2025-12-04T12:53:07.9620586Z * [new branch] gh/robert-hardwick/8/head -> origin/gh/robert-hardwick/8/head 2025-12-04T12:53:07.9620666Z * [new branch] gh/robert-hardwick/8/orig -> origin/gh/robert-hardwick/8/orig 2025-12-04T12:53:07.9620749Z * [new branch] gh/robert-hardwick/9/base -> origin/gh/robert-hardwick/9/base 2025-12-04T12:53:07.9620829Z * [new branch] gh/robert-hardwick/9/head -> origin/gh/robert-hardwick/9/head 2025-12-04T12:53:07.9620910Z * [new branch] gh/robert-hardwick/9/orig -> origin/gh/robert-hardwick/9/orig 2025-12-04T12:53:07.9620984Z * [new branch] gh/rtimpe/1/base -> origin/gh/rtimpe/1/base 2025-12-04T12:53:07.9621052Z * [new branch] gh/rtimpe/1/head -> origin/gh/rtimpe/1/head 2025-12-04T12:53:07.9621120Z * [new branch] gh/rtimpe/2/base -> origin/gh/rtimpe/2/base 2025-12-04T12:53:07.9621191Z * [new branch] gh/rtimpe/2/head -> origin/gh/rtimpe/2/head 2025-12-04T12:53:07.9621259Z * [new branch] gh/rtimpe/22/base -> origin/gh/rtimpe/22/base 2025-12-04T12:53:07.9621326Z * [new branch] gh/rtimpe/22/head -> origin/gh/rtimpe/22/head 2025-12-04T12:53:07.9621395Z * [new branch] gh/rtimpe/22/orig -> origin/gh/rtimpe/22/orig 2025-12-04T12:53:07.9621461Z * [new branch] gh/rtimpe/23/base -> origin/gh/rtimpe/23/base 2025-12-04T12:53:07.9621527Z * [new branch] gh/rtimpe/23/head -> origin/gh/rtimpe/23/head 2025-12-04T12:53:07.9621645Z * [new branch] gh/rtimpe/23/orig -> origin/gh/rtimpe/23/orig 2025-12-04T12:53:07.9621710Z * [new branch] gh/rtimpe/24/base -> origin/gh/rtimpe/24/base 2025-12-04T12:53:07.9621776Z * [new branch] gh/rtimpe/24/head -> origin/gh/rtimpe/24/head 2025-12-04T12:53:07.9621846Z * [new branch] gh/rtimpe/24/orig -> origin/gh/rtimpe/24/orig 2025-12-04T12:53:07.9621912Z * [new branch] gh/rtimpe/25/base -> origin/gh/rtimpe/25/base 2025-12-04T12:53:07.9621977Z * [new branch] gh/rtimpe/25/head -> origin/gh/rtimpe/25/head 2025-12-04T12:53:07.9622043Z * [new branch] gh/rtimpe/25/orig -> origin/gh/rtimpe/25/orig 2025-12-04T12:53:07.9622108Z * [new branch] 
gh/rtimpe/26/base -> origin/gh/rtimpe/26/base 2025-12-04T12:53:07.9622175Z * [new branch] gh/rtimpe/26/head -> origin/gh/rtimpe/26/head 2025-12-04T12:53:07.9622242Z * [new branch] gh/rtimpe/26/orig -> origin/gh/rtimpe/26/orig 2025-12-04T12:53:07.9622310Z * [new branch] gh/rtimpe/27/base -> origin/gh/rtimpe/27/base 2025-12-04T12:53:07.9622379Z * [new branch] gh/rtimpe/27/head -> origin/gh/rtimpe/27/head 2025-12-04T12:53:07.9622445Z * [new branch] gh/rtimpe/27/orig -> origin/gh/rtimpe/27/orig 2025-12-04T12:53:07.9622555Z * [new branch] gh/rtimpe/28/base -> origin/gh/rtimpe/28/base 2025-12-04T12:53:07.9622622Z * [new branch] gh/rtimpe/28/head -> origin/gh/rtimpe/28/head 2025-12-04T12:53:07.9622688Z * [new branch] gh/rtimpe/28/orig -> origin/gh/rtimpe/28/orig 2025-12-04T12:53:07.9622754Z * [new branch] gh/rtimpe/29/base -> origin/gh/rtimpe/29/base 2025-12-04T12:53:07.9622821Z * [new branch] gh/rtimpe/29/head -> origin/gh/rtimpe/29/head 2025-12-04T12:53:07.9622890Z * [new branch] gh/rtimpe/29/orig -> origin/gh/rtimpe/29/orig 2025-12-04T12:53:07.9622955Z * [new branch] gh/rtimpe/3/base -> origin/gh/rtimpe/3/base 2025-12-04T12:53:07.9623024Z * [new branch] gh/rtimpe/3/head -> origin/gh/rtimpe/3/head 2025-12-04T12:53:07.9623090Z * [new branch] gh/rtimpe/30/base -> origin/gh/rtimpe/30/base 2025-12-04T12:53:07.9623157Z * [new branch] gh/rtimpe/30/head -> origin/gh/rtimpe/30/head 2025-12-04T12:53:07.9623223Z * [new branch] gh/rtimpe/30/orig -> origin/gh/rtimpe/30/orig 2025-12-04T12:53:07.9623290Z * [new branch] gh/rtimpe/31/base -> origin/gh/rtimpe/31/base 2025-12-04T12:53:07.9623355Z * [new branch] gh/rtimpe/31/head -> origin/gh/rtimpe/31/head 2025-12-04T12:53:07.9623422Z * [new branch] gh/rtimpe/31/orig -> origin/gh/rtimpe/31/orig 2025-12-04T12:53:07.9623490Z * [new branch] gh/rtimpe/32/base -> origin/gh/rtimpe/32/base 2025-12-04T12:53:07.9623555Z * [new branch] gh/rtimpe/32/head -> origin/gh/rtimpe/32/head 2025-12-04T12:53:07.9623622Z * [new branch] gh/rtimpe/32/orig -> origin/gh/rtimpe/32/orig 2025-12-04T12:53:07.9623687Z * [new branch] gh/rtimpe/33/base -> origin/gh/rtimpe/33/base 2025-12-04T12:53:07.9623753Z * [new branch] gh/rtimpe/33/head -> origin/gh/rtimpe/33/head 2025-12-04T12:53:07.9623820Z * [new branch] gh/rtimpe/33/orig -> origin/gh/rtimpe/33/orig 2025-12-04T12:53:07.9623886Z * [new branch] gh/rtimpe/34/base -> origin/gh/rtimpe/34/base 2025-12-04T12:53:07.9623954Z * [new branch] gh/rtimpe/34/head -> origin/gh/rtimpe/34/head 2025-12-04T12:53:07.9624020Z * [new branch] gh/rtimpe/34/orig -> origin/gh/rtimpe/34/orig 2025-12-04T12:53:07.9624117Z * [new branch] gh/rtimpe/35/base -> origin/gh/rtimpe/35/base 2025-12-04T12:53:07.9624184Z * [new branch] gh/rtimpe/35/head -> origin/gh/rtimpe/35/head 2025-12-04T12:53:07.9624249Z * [new branch] gh/rtimpe/35/orig -> origin/gh/rtimpe/35/orig 2025-12-04T12:53:07.9624314Z * [new branch] gh/rtimpe/4/base -> origin/gh/rtimpe/4/base 2025-12-04T12:53:07.9624382Z * [new branch] gh/rtimpe/4/head -> origin/gh/rtimpe/4/head 2025-12-04T12:53:07.9624463Z * [new branch] gh/ruisizhang123/1/base -> origin/gh/ruisizhang123/1/base 2025-12-04T12:53:07.9624541Z * [new branch] gh/ruisizhang123/1/head -> origin/gh/ruisizhang123/1/head 2025-12-04T12:53:07.9624618Z * [new branch] gh/ruisizhang123/1/orig -> origin/gh/ruisizhang123/1/orig 2025-12-04T12:53:07.9624692Z * [new branch] gh/ruisizhang123/4/base -> origin/gh/ruisizhang123/4/base 2025-12-04T12:53:07.9624768Z * [new branch] gh/ruisizhang123/4/head -> origin/gh/ruisizhang123/4/head 2025-12-04T12:53:07.9624844Z * [new 
branch] gh/ruisizhang123/4/orig -> origin/gh/ruisizhang123/4/orig 2025-12-04T12:53:07.9624918Z * [new branch] gh/ruisizhang123/5/base -> origin/gh/ruisizhang123/5/base 2025-12-04T12:53:07.9624992Z * [new branch] gh/ruisizhang123/5/head -> origin/gh/ruisizhang123/5/head 2025-12-04T12:53:07.9625094Z * [new branch] gh/ruisizhang123/5/orig -> origin/gh/ruisizhang123/5/orig 2025-12-04T12:53:07.9625168Z * [new branch] gh/ruisizhang123/6/base -> origin/gh/ruisizhang123/6/base 2025-12-04T12:53:07.9625242Z * [new branch] gh/ruisizhang123/6/head -> origin/gh/ruisizhang123/6/head 2025-12-04T12:53:07.9625317Z * [new branch] gh/ruisizhang123/6/orig -> origin/gh/ruisizhang123/6/orig 2025-12-04T12:53:07.9625391Z * [new branch] gh/ruisizhang123/7/base -> origin/gh/ruisizhang123/7/base 2025-12-04T12:53:07.9625467Z * [new branch] gh/ruisizhang123/7/head -> origin/gh/ruisizhang123/7/head 2025-12-04T12:53:07.9625543Z * [new branch] gh/ruisizhang123/7/orig -> origin/gh/ruisizhang123/7/orig 2025-12-04T12:53:07.9625619Z * [new branch] gh/ruisizhang123/8/base -> origin/gh/ruisizhang123/8/base 2025-12-04T12:53:07.9625695Z * [new branch] gh/ruisizhang123/8/head -> origin/gh/ruisizhang123/8/head 2025-12-04T12:53:07.9625769Z * [new branch] gh/ruisizhang123/8/orig -> origin/gh/ruisizhang123/8/orig 2025-12-04T12:53:07.9625842Z * [new branch] gh/ruisizhang123/9/base -> origin/gh/ruisizhang123/9/base 2025-12-04T12:53:07.9625918Z * [new branch] gh/ruisizhang123/9/head -> origin/gh/ruisizhang123/9/head 2025-12-04T12:53:07.9625991Z * [new branch] gh/ruisizhang123/9/orig -> origin/gh/ruisizhang123/9/orig 2025-12-04T12:53:07.9626067Z * [new branch] gh/seemethere/52/base -> origin/gh/seemethere/52/base 2025-12-04T12:53:07.9626142Z * [new branch] gh/seemethere/52/head -> origin/gh/seemethere/52/head 2025-12-04T12:53:07.9626215Z * [new branch] gh/seemethere/52/orig -> origin/gh/seemethere/52/orig 2025-12-04T12:53:07.9626287Z * [new branch] gh/seemethere/53/base -> origin/gh/seemethere/53/base 2025-12-04T12:53:07.9626362Z * [new branch] gh/seemethere/53/head -> origin/gh/seemethere/53/head 2025-12-04T12:53:07.9626434Z * [new branch] gh/seemethere/53/orig -> origin/gh/seemethere/53/orig 2025-12-04T12:53:07.9626507Z * [new branch] gh/seemethere/54/base -> origin/gh/seemethere/54/base 2025-12-04T12:53:07.9626579Z * [new branch] gh/seemethere/54/head -> origin/gh/seemethere/54/head 2025-12-04T12:53:07.9626650Z * [new branch] gh/seemethere/54/orig -> origin/gh/seemethere/54/orig 2025-12-04T12:53:07.9626722Z * [new branch] gh/seemethere/55/base -> origin/gh/seemethere/55/base 2025-12-04T12:53:07.9626825Z * [new branch] gh/seemethere/55/head -> origin/gh/seemethere/55/head 2025-12-04T12:53:07.9626896Z * [new branch] gh/seemethere/55/orig -> origin/gh/seemethere/55/orig 2025-12-04T12:53:07.9626968Z * [new branch] gh/seemethere/59/base -> origin/gh/seemethere/59/base 2025-12-04T12:53:07.9627042Z * [new branch] gh/seemethere/59/head -> origin/gh/seemethere/59/head 2025-12-04T12:53:07.9627114Z * [new branch] gh/seemethere/59/orig -> origin/gh/seemethere/59/orig 2025-12-04T12:53:07.9627192Z * [new branch] gh/seemethere/62/base -> origin/gh/seemethere/62/base 2025-12-04T12:53:07.9627265Z * [new branch] gh/seemethere/62/head -> origin/gh/seemethere/62/head 2025-12-04T12:53:07.9627337Z * [new branch] gh/seemethere/62/orig -> origin/gh/seemethere/62/orig 2025-12-04T12:53:07.9627413Z * [new branch] gh/seemethere/63/base -> origin/gh/seemethere/63/base 2025-12-04T12:53:07.9627484Z * [new branch] gh/seemethere/63/head -> 
origin/gh/seemethere/63/head 2025-12-04T12:53:07.9627555Z * [new branch] gh/seemethere/63/orig -> origin/gh/seemethere/63/orig 2025-12-04T12:53:07.9627628Z * [new branch] gh/seemethere/71/base -> origin/gh/seemethere/71/base 2025-12-04T12:53:07.9627730Z * [new branch] gh/seemethere/71/head -> origin/gh/seemethere/71/head 2025-12-04T12:53:07.9627804Z * [new branch] gh/seemethere/71/orig -> origin/gh/seemethere/71/orig 2025-12-04T12:53:07.9627880Z * [new branch] gh/seemethere/72/base -> origin/gh/seemethere/72/base 2025-12-04T12:53:07.9627952Z * [new branch] gh/seemethere/72/head -> origin/gh/seemethere/72/head 2025-12-04T12:53:07.9628026Z * [new branch] gh/seemethere/72/orig -> origin/gh/seemethere/72/orig 2025-12-04T12:53:07.9628103Z * [new branch] gh/seemethere/73/base -> origin/gh/seemethere/73/base 2025-12-04T12:53:07.9628174Z * [new branch] gh/seemethere/73/head -> origin/gh/seemethere/73/head 2025-12-04T12:53:07.9628247Z * [new branch] gh/seemethere/73/orig -> origin/gh/seemethere/73/orig 2025-12-04T12:53:07.9628320Z * [new branch] gh/seemethere/74/base -> origin/gh/seemethere/74/base 2025-12-04T12:53:07.9628395Z * [new branch] gh/seemethere/74/head -> origin/gh/seemethere/74/head 2025-12-04T12:53:07.9628468Z * [new branch] gh/seemethere/74/orig -> origin/gh/seemethere/74/orig 2025-12-04T12:53:07.9628542Z * [new branch] gh/seemethere/75/base -> origin/gh/seemethere/75/base 2025-12-04T12:53:07.9628612Z * [new branch] gh/seemethere/75/head -> origin/gh/seemethere/75/head 2025-12-04T12:53:07.9628685Z * [new branch] gh/seemethere/75/orig -> origin/gh/seemethere/75/orig 2025-12-04T12:53:07.9628758Z * [new branch] gh/seemethere/76/base -> origin/gh/seemethere/76/base 2025-12-04T12:53:07.9628829Z * [new branch] gh/seemethere/76/head -> origin/gh/seemethere/76/head 2025-12-04T12:53:07.9628901Z * [new branch] gh/seemethere/76/orig -> origin/gh/seemethere/76/orig 2025-12-04T12:53:07.9628978Z * [new branch] gh/shunting314/145/base -> origin/gh/shunting314/145/base 2025-12-04T12:53:07.9629053Z * [new branch] gh/shunting314/145/head -> origin/gh/shunting314/145/head 2025-12-04T12:53:07.9629128Z * [new branch] gh/shunting314/145/orig -> origin/gh/shunting314/145/orig 2025-12-04T12:53:07.9629202Z * [new branch] gh/shunting314/176/base -> origin/gh/shunting314/176/base 2025-12-04T12:53:07.9629275Z * [new branch] gh/shunting314/176/head -> origin/gh/shunting314/176/head 2025-12-04T12:53:07.9629351Z * [new branch] gh/shunting314/176/orig -> origin/gh/shunting314/176/orig 2025-12-04T12:53:07.9629453Z * [new branch] gh/shunting314/249/base -> origin/gh/shunting314/249/base 2025-12-04T12:53:07.9629526Z * [new branch] gh/shunting314/249/head -> origin/gh/shunting314/249/head 2025-12-04T12:53:07.9629601Z * [new branch] gh/shunting314/249/orig -> origin/gh/shunting314/249/orig 2025-12-04T12:53:07.9629675Z * [new branch] gh/shunting314/253/base -> origin/gh/shunting314/253/base 2025-12-04T12:53:07.9629747Z * [new branch] gh/shunting314/253/head -> origin/gh/shunting314/253/head 2025-12-04T12:53:07.9629822Z * [new branch] gh/shunting314/253/orig -> origin/gh/shunting314/253/orig 2025-12-04T12:53:07.9629895Z * [new branch] gh/shunting314/256/base -> origin/gh/shunting314/256/base 2025-12-04T12:53:07.9629967Z * [new branch] gh/shunting314/256/head -> origin/gh/shunting314/256/head 2025-12-04T12:53:07.9630043Z * [new branch] gh/shunting314/256/orig -> origin/gh/shunting314/256/orig 2025-12-04T12:53:07.9630117Z * [new branch] gh/shunting314/257/base -> origin/gh/shunting314/257/base 2025-12-04T12:53:07.9630227Z * 
[new branch] gh/shunting314/257/head -> origin/gh/shunting314/257/head 2025-12-04T12:53:07.9630301Z * [new branch] gh/shunting314/257/orig -> origin/gh/shunting314/257/orig 2025-12-04T12:53:07.9630423Z * [new branch] gh/shunting314/258/base -> origin/gh/shunting314/258/base 2025-12-04T12:53:07.9630500Z * [new branch] gh/shunting314/258/head -> origin/gh/shunting314/258/head 2025-12-04T12:53:07.9630573Z * [new branch] gh/shunting314/258/orig -> origin/gh/shunting314/258/orig 2025-12-04T12:53:07.9630647Z * [new branch] gh/shunting314/259/base -> origin/gh/shunting314/259/base 2025-12-04T12:53:07.9630720Z * [new branch] gh/shunting314/259/head -> origin/gh/shunting314/259/head 2025-12-04T12:53:07.9630794Z * [new branch] gh/shunting314/259/orig -> origin/gh/shunting314/259/orig 2025-12-04T12:53:07.9630867Z * [new branch] gh/shunting314/260/base -> origin/gh/shunting314/260/base 2025-12-04T12:53:07.9630941Z * [new branch] gh/shunting314/260/head -> origin/gh/shunting314/260/head 2025-12-04T12:53:07.9631015Z * [new branch] gh/shunting314/260/orig -> origin/gh/shunting314/260/orig 2025-12-04T12:53:07.9631088Z * [new branch] gh/shunting314/261/base -> origin/gh/shunting314/261/base 2025-12-04T12:53:07.9631163Z * [new branch] gh/shunting314/261/head -> origin/gh/shunting314/261/head 2025-12-04T12:53:07.9631238Z * [new branch] gh/shunting314/261/orig -> origin/gh/shunting314/261/orig 2025-12-04T12:53:07.9631310Z * [new branch] gh/shunting314/262/base -> origin/gh/shunting314/262/base 2025-12-04T12:53:07.9631385Z * [new branch] gh/shunting314/262/head -> origin/gh/shunting314/262/head 2025-12-04T12:53:07.9631460Z * [new branch] gh/shunting314/262/orig -> origin/gh/shunting314/262/orig 2025-12-04T12:53:07.9631535Z * [new branch] gh/shunting314/263/base -> origin/gh/shunting314/263/base 2025-12-04T12:53:07.9631609Z * [new branch] gh/shunting314/263/head -> origin/gh/shunting314/263/head 2025-12-04T12:53:07.9631682Z * [new branch] gh/shunting314/263/orig -> origin/gh/shunting314/263/orig 2025-12-04T12:53:07.9631755Z * [new branch] gh/shunting314/264/base -> origin/gh/shunting314/264/base 2025-12-04T12:53:07.9631828Z * [new branch] gh/shunting314/264/head -> origin/gh/shunting314/264/head 2025-12-04T12:53:07.9631900Z * [new branch] gh/shunting314/264/orig -> origin/gh/shunting314/264/orig 2025-12-04T12:53:07.9631974Z * [new branch] gh/shunting314/265/base -> origin/gh/shunting314/265/base 2025-12-04T12:53:07.9632048Z * [new branch] gh/shunting314/265/head -> origin/gh/shunting314/265/head 2025-12-04T12:53:07.9632173Z * [new branch] gh/shunting314/265/orig -> origin/gh/shunting314/265/orig 2025-12-04T12:53:07.9632248Z * [new branch] gh/shunting314/266/base -> origin/gh/shunting314/266/base 2025-12-04T12:53:07.9632319Z * [new branch] gh/shunting314/266/head -> origin/gh/shunting314/266/head 2025-12-04T12:53:07.9632393Z * [new branch] gh/shunting314/266/orig -> origin/gh/shunting314/266/orig 2025-12-04T12:53:07.9632467Z * [new branch] gh/shunting314/267/base -> origin/gh/shunting314/267/base 2025-12-04T12:53:07.9632539Z * [new branch] gh/shunting314/267/head -> origin/gh/shunting314/267/head 2025-12-04T12:53:07.9632613Z * [new branch] gh/shunting314/267/orig -> origin/gh/shunting314/267/orig 2025-12-04T12:53:07.9632689Z * [new branch] gh/shunting314/268/base -> origin/gh/shunting314/268/base 2025-12-04T12:53:07.9632764Z * [new branch] gh/shunting314/268/head -> origin/gh/shunting314/268/head 2025-12-04T12:53:07.9632837Z * [new branch] gh/shunting314/268/orig -> origin/gh/shunting314/268/orig 
2025-12-04T12:53:07.9632912Z * [new branch] gh/shunting314/269/base -> origin/gh/shunting314/269/base 2025-12-04T12:53:07.9633016Z * [new branch] gh/shunting314/269/head -> origin/gh/shunting314/269/head 2025-12-04T12:53:07.9633089Z * [new branch] gh/shunting314/269/orig -> origin/gh/shunting314/269/orig 2025-12-04T12:53:07.9633164Z * [new branch] gh/silverguo/1/base -> origin/gh/silverguo/1/base 2025-12-04T12:53:07.9633234Z * [new branch] gh/silverguo/1/head -> origin/gh/silverguo/1/head 2025-12-04T12:53:07.9633304Z * [new branch] gh/silverguo/2/base -> origin/gh/silverguo/2/base 2025-12-04T12:53:07.9633376Z * [new branch] gh/silverguo/2/head -> origin/gh/silverguo/2/head 2025-12-04T12:53:07.9633446Z * [new branch] gh/silverguo/3/base -> origin/gh/silverguo/3/base 2025-12-04T12:53:07.9633515Z * [new branch] gh/silverguo/3/head -> origin/gh/silverguo/3/head 2025-12-04T12:53:07.9633584Z * [new branch] gh/silverguo/4/base -> origin/gh/silverguo/4/base 2025-12-04T12:53:07.9633654Z * [new branch] gh/silverguo/4/head -> origin/gh/silverguo/4/head 2025-12-04T12:53:07.9633728Z * [new branch] gh/slayton58/39/base -> origin/gh/slayton58/39/base 2025-12-04T12:53:07.9633800Z * [new branch] gh/slayton58/39/head -> origin/gh/slayton58/39/head 2025-12-04T12:53:07.9633872Z * [new branch] gh/slayton58/39/orig -> origin/gh/slayton58/39/orig 2025-12-04T12:53:07.9633942Z * [new branch] gh/slayton58/42/base -> origin/gh/slayton58/42/base 2025-12-04T12:53:07.9634011Z * [new branch] gh/slayton58/42/head -> origin/gh/slayton58/42/head 2025-12-04T12:53:07.9634081Z * [new branch] gh/slayton58/42/orig -> origin/gh/slayton58/42/orig 2025-12-04T12:53:07.9634154Z * [new branch] gh/slayton58/43/base -> origin/gh/slayton58/43/base 2025-12-04T12:53:07.9634224Z * [new branch] gh/slayton58/43/head -> origin/gh/slayton58/43/head 2025-12-04T12:53:07.9634295Z * [new branch] gh/slayton58/43/orig -> origin/gh/slayton58/43/orig 2025-12-04T12:53:07.9634365Z * [new branch] gh/slayton58/44/base -> origin/gh/slayton58/44/base 2025-12-04T12:53:07.9634435Z * [new branch] gh/slayton58/44/head -> origin/gh/slayton58/44/head 2025-12-04T12:53:07.9634504Z * [new branch] gh/slayton58/44/orig -> origin/gh/slayton58/44/orig 2025-12-04T12:53:07.9634573Z * [new branch] gh/slayton58/45/base -> origin/gh/slayton58/45/base 2025-12-04T12:53:07.9634642Z * [new branch] gh/slayton58/45/head -> origin/gh/slayton58/45/head 2025-12-04T12:53:07.9634743Z * [new branch] gh/slayton58/45/orig -> origin/gh/slayton58/45/orig 2025-12-04T12:53:07.9634814Z * [new branch] gh/slayton58/46/base -> origin/gh/slayton58/46/base 2025-12-04T12:53:07.9634885Z * [new branch] gh/slayton58/46/head -> origin/gh/slayton58/46/head 2025-12-04T12:53:07.9634955Z * [new branch] gh/slayton58/46/orig -> origin/gh/slayton58/46/orig 2025-12-04T12:53:07.9635026Z * [new branch] gh/slayton58/6/base -> origin/gh/slayton58/6/base 2025-12-04T12:53:07.9635097Z * [new branch] gh/slayton58/6/head -> origin/gh/slayton58/6/head 2025-12-04T12:53:07.9635166Z * [new branch] gh/slayton58/7/base -> origin/gh/slayton58/7/base 2025-12-04T12:53:07.9635234Z * [new branch] gh/slayton58/7/head -> origin/gh/slayton58/7/head 2025-12-04T12:53:07.9635308Z * [new branch] gh/soulitzer/269/base -> origin/gh/soulitzer/269/base 2025-12-04T12:53:07.9635384Z * [new branch] gh/soulitzer/269/head -> origin/gh/soulitzer/269/head 2025-12-04T12:53:07.9635457Z * [new branch] gh/soulitzer/269/orig -> origin/gh/soulitzer/269/orig 2025-12-04T12:53:07.9635527Z * [new branch] gh/soulitzer/276/base -> origin/gh/soulitzer/276/base 
2025-12-04T12:53:07.9635625Z * [new branch] gh/soulitzer/276/head -> origin/gh/soulitzer/276/head 2025-12-04T12:53:07.9635697Z * [new branch] gh/soulitzer/276/orig -> origin/gh/soulitzer/276/orig 2025-12-04T12:53:07.9635768Z * [new branch] gh/soulitzer/287/base -> origin/gh/soulitzer/287/base 2025-12-04T12:53:07.9635842Z * [new branch] gh/soulitzer/287/head -> origin/gh/soulitzer/287/head 2025-12-04T12:53:07.9635914Z * [new branch] gh/soulitzer/287/orig -> origin/gh/soulitzer/287/orig 2025-12-04T12:53:07.9635987Z * [new branch] gh/soulitzer/296/base -> origin/gh/soulitzer/296/base 2025-12-04T12:53:07.9636061Z * [new branch] gh/soulitzer/296/head -> origin/gh/soulitzer/296/head 2025-12-04T12:53:07.9636131Z * [new branch] gh/soulitzer/296/orig -> origin/gh/soulitzer/296/orig 2025-12-04T12:53:07.9636201Z * [new branch] gh/soulitzer/299/base -> origin/gh/soulitzer/299/base 2025-12-04T12:53:07.9636274Z * [new branch] gh/soulitzer/299/head -> origin/gh/soulitzer/299/head 2025-12-04T12:53:07.9636345Z * [new branch] gh/soulitzer/299/orig -> origin/gh/soulitzer/299/orig 2025-12-04T12:53:07.9636417Z * [new branch] gh/soulitzer/300/base -> origin/gh/soulitzer/300/base 2025-12-04T12:53:07.9636490Z * [new branch] gh/soulitzer/300/head -> origin/gh/soulitzer/300/head 2025-12-04T12:53:07.9636560Z * [new branch] gh/soulitzer/300/orig -> origin/gh/soulitzer/300/orig 2025-12-04T12:53:07.9636636Z * [new branch] gh/soulitzer/301/base -> origin/gh/soulitzer/301/base 2025-12-04T12:53:07.9636707Z * [new branch] gh/soulitzer/301/head -> origin/gh/soulitzer/301/head 2025-12-04T12:53:07.9636779Z * [new branch] gh/soulitzer/301/orig -> origin/gh/soulitzer/301/orig 2025-12-04T12:53:07.9636851Z * [new branch] gh/soulitzer/313/base -> origin/gh/soulitzer/313/base 2025-12-04T12:53:07.9636922Z * [new branch] gh/soulitzer/313/head -> origin/gh/soulitzer/313/head 2025-12-04T12:53:07.9636993Z * [new branch] gh/soulitzer/313/orig -> origin/gh/soulitzer/313/orig 2025-12-04T12:53:07.9637065Z * [new branch] gh/soulitzer/319/base -> origin/gh/soulitzer/319/base 2025-12-04T12:53:07.9637136Z * [new branch] gh/soulitzer/319/head -> origin/gh/soulitzer/319/head 2025-12-04T12:53:07.9637207Z * [new branch] gh/soulitzer/319/orig -> origin/gh/soulitzer/319/orig 2025-12-04T12:53:07.9637308Z * [new branch] gh/soulitzer/320/base -> origin/gh/soulitzer/320/base 2025-12-04T12:53:07.9637381Z * [new branch] gh/soulitzer/320/head -> origin/gh/soulitzer/320/head 2025-12-04T12:53:07.9637451Z * [new branch] gh/soulitzer/320/orig -> origin/gh/soulitzer/320/orig 2025-12-04T12:53:07.9637522Z * [new branch] gh/soulitzer/336/base -> origin/gh/soulitzer/336/base 2025-12-04T12:53:07.9637594Z * [new branch] gh/soulitzer/336/head -> origin/gh/soulitzer/336/head 2025-12-04T12:53:07.9637665Z * [new branch] gh/soulitzer/336/orig -> origin/gh/soulitzer/336/orig 2025-12-04T12:53:07.9637740Z * [new branch] gh/soulitzer/347/base -> origin/gh/soulitzer/347/base 2025-12-04T12:53:07.9637811Z * [new branch] gh/soulitzer/347/head -> origin/gh/soulitzer/347/head 2025-12-04T12:53:07.9637881Z * [new branch] gh/soulitzer/347/orig -> origin/gh/soulitzer/347/orig 2025-12-04T12:53:07.9637955Z * [new branch] gh/soulitzer/349/base -> origin/gh/soulitzer/349/base 2025-12-04T12:53:07.9638025Z * [new branch] gh/soulitzer/349/head -> origin/gh/soulitzer/349/head 2025-12-04T12:53:07.9638099Z * [new branch] gh/soulitzer/349/orig -> origin/gh/soulitzer/349/orig 2025-12-04T12:53:07.9638195Z * [new branch] gh/soulitzer/350/base -> origin/gh/soulitzer/350/base 
2025-12-04T12:53:07.9638270Z * [new branch] gh/soulitzer/350/head -> origin/gh/soulitzer/350/head 2025-12-04T12:53:07.9638341Z * [new branch] gh/soulitzer/350/orig -> origin/gh/soulitzer/350/orig 2025-12-04T12:53:07.9638412Z * [new branch] gh/soulitzer/351/base -> origin/gh/soulitzer/351/base 2025-12-04T12:53:07.9638484Z * [new branch] gh/soulitzer/351/head -> origin/gh/soulitzer/351/head 2025-12-04T12:53:07.9638555Z * [new branch] gh/soulitzer/351/orig -> origin/gh/soulitzer/351/orig 2025-12-04T12:53:07.9638627Z * [new branch] gh/soulitzer/353/base -> origin/gh/soulitzer/353/base 2025-12-04T12:53:07.9638699Z * [new branch] gh/soulitzer/353/head -> origin/gh/soulitzer/353/head 2025-12-04T12:53:07.9638771Z * [new branch] gh/soulitzer/353/orig -> origin/gh/soulitzer/353/orig 2025-12-04T12:53:07.9638844Z * [new branch] gh/soulitzer/358/base -> origin/gh/soulitzer/358/base 2025-12-04T12:53:07.9638914Z * [new branch] gh/soulitzer/358/head -> origin/gh/soulitzer/358/head 2025-12-04T12:53:07.9638988Z * [new branch] gh/soulitzer/358/orig -> origin/gh/soulitzer/358/orig 2025-12-04T12:53:07.9639060Z * [new branch] gh/soulitzer/359/base -> origin/gh/soulitzer/359/base 2025-12-04T12:53:07.9639132Z * [new branch] gh/soulitzer/359/head -> origin/gh/soulitzer/359/head 2025-12-04T12:53:07.9639204Z * [new branch] gh/soulitzer/359/orig -> origin/gh/soulitzer/359/orig 2025-12-04T12:53:07.9639277Z * [new branch] gh/soulitzer/374/base -> origin/gh/soulitzer/374/base 2025-12-04T12:53:07.9639349Z * [new branch] gh/soulitzer/374/head -> origin/gh/soulitzer/374/head 2025-12-04T12:53:07.9639425Z * [new branch] gh/soulitzer/374/orig -> origin/gh/soulitzer/374/orig 2025-12-04T12:53:07.9639498Z * [new branch] gh/soulitzer/375/base -> origin/gh/soulitzer/375/base 2025-12-04T12:53:07.9639568Z * [new branch] gh/soulitzer/375/head -> origin/gh/soulitzer/375/head 2025-12-04T12:53:07.9639639Z * [new branch] gh/soulitzer/375/orig -> origin/gh/soulitzer/375/orig 2025-12-04T12:53:07.9639709Z * [new branch] gh/soulitzer/380/base -> origin/gh/soulitzer/380/base 2025-12-04T12:53:07.9639782Z * [new branch] gh/soulitzer/380/head -> origin/gh/soulitzer/380/head 2025-12-04T12:53:07.9639855Z * [new branch] gh/soulitzer/380/orig -> origin/gh/soulitzer/380/orig 2025-12-04T12:53:07.9639960Z * [new branch] gh/soulitzer/385/base -> origin/gh/soulitzer/385/base 2025-12-04T12:53:07.9640030Z * [new branch] gh/soulitzer/385/head -> origin/gh/soulitzer/385/head 2025-12-04T12:53:07.9640103Z * [new branch] gh/soulitzer/385/orig -> origin/gh/soulitzer/385/orig 2025-12-04T12:53:07.9640204Z * [new branch] gh/soulitzer/386/base -> origin/gh/soulitzer/386/base 2025-12-04T12:53:07.9640279Z * [new branch] gh/soulitzer/386/head -> origin/gh/soulitzer/386/head 2025-12-04T12:53:07.9640351Z * [new branch] gh/soulitzer/386/orig -> origin/gh/soulitzer/386/orig 2025-12-04T12:53:07.9640422Z * [new branch] gh/soulitzer/387/base -> origin/gh/soulitzer/387/base 2025-12-04T12:53:07.9640492Z * [new branch] gh/soulitzer/387/head -> origin/gh/soulitzer/387/head 2025-12-04T12:53:07.9640565Z * [new branch] gh/soulitzer/387/orig -> origin/gh/soulitzer/387/orig 2025-12-04T12:53:07.9640635Z * [new branch] gh/soulitzer/388/base -> origin/gh/soulitzer/388/base 2025-12-04T12:53:07.9640706Z * [new branch] gh/soulitzer/388/head -> origin/gh/soulitzer/388/head 2025-12-04T12:53:07.9640777Z * [new branch] gh/soulitzer/388/orig -> origin/gh/soulitzer/388/orig 2025-12-04T12:53:07.9640897Z * [new branch] gh/soulitzer/389/base -> origin/gh/soulitzer/389/base 
2025-12-04T12:53:07.9640968Z * [new branch] gh/soulitzer/389/head -> origin/gh/soulitzer/389/head 2025-12-04T12:53:07.9641040Z * [new branch] gh/soulitzer/389/orig -> origin/gh/soulitzer/389/orig 2025-12-04T12:53:07.9641113Z * [new branch] gh/soulitzer/390/base -> origin/gh/soulitzer/390/base 2025-12-04T12:53:07.9641182Z * [new branch] gh/soulitzer/390/head -> origin/gh/soulitzer/390/head 2025-12-04T12:53:07.9641255Z * [new branch] gh/soulitzer/390/orig -> origin/gh/soulitzer/390/orig 2025-12-04T12:53:07.9641325Z * [new branch] gh/soulitzer/391/base -> origin/gh/soulitzer/391/base 2025-12-04T12:53:07.9641395Z * [new branch] gh/soulitzer/391/head -> origin/gh/soulitzer/391/head 2025-12-04T12:53:07.9641467Z * [new branch] gh/soulitzer/391/orig -> origin/gh/soulitzer/391/orig 2025-12-04T12:53:07.9641539Z * [new branch] gh/soulitzer/392/base -> origin/gh/soulitzer/392/base 2025-12-04T12:53:07.9641615Z * [new branch] gh/soulitzer/392/head -> origin/gh/soulitzer/392/head 2025-12-04T12:53:07.9641687Z * [new branch] gh/soulitzer/392/orig -> origin/gh/soulitzer/392/orig 2025-12-04T12:53:07.9641758Z * [new branch] gh/swolchok/728/next -> origin/gh/swolchok/728/next 2025-12-04T12:53:07.9641829Z * [new branch] gh/swolchok/819/base -> origin/gh/swolchok/819/base 2025-12-04T12:53:07.9641900Z * [new branch] gh/swolchok/819/head -> origin/gh/swolchok/819/head 2025-12-04T12:53:07.9641968Z * [new branch] gh/swolchok/819/orig -> origin/gh/swolchok/819/orig 2025-12-04T12:53:07.9642042Z * [new branch] gh/swolchok/824/base -> origin/gh/swolchok/824/base 2025-12-04T12:53:07.9642111Z * [new branch] gh/swolchok/824/head -> origin/gh/swolchok/824/head 2025-12-04T12:53:07.9642180Z * [new branch] gh/swolchok/824/orig -> origin/gh/swolchok/824/orig 2025-12-04T12:53:07.9642250Z * [new branch] gh/swolchok/829/base -> origin/gh/swolchok/829/base 2025-12-04T12:53:07.9642319Z * [new branch] gh/swolchok/829/head -> origin/gh/swolchok/829/head 2025-12-04T12:53:07.9642388Z * [new branch] gh/swolchok/829/orig -> origin/gh/swolchok/829/orig 2025-12-04T12:53:07.9642459Z * [new branch] gh/swolchok/839/base -> origin/gh/swolchok/839/base 2025-12-04T12:53:07.9642567Z * [new branch] gh/swolchok/839/head -> origin/gh/swolchok/839/head 2025-12-04T12:53:07.9642636Z * [new branch] gh/swolchok/839/orig -> origin/gh/swolchok/839/orig 2025-12-04T12:53:07.9642706Z * [new branch] gh/swolchok/841/base -> origin/gh/swolchok/841/base 2025-12-04T12:53:07.9642777Z * [new branch] gh/swolchok/841/head -> origin/gh/swolchok/841/head 2025-12-04T12:53:07.9642846Z * [new branch] gh/swolchok/841/orig -> origin/gh/swolchok/841/orig 2025-12-04T12:53:07.9642917Z * [new branch] gh/swolchok/842/base -> origin/gh/swolchok/842/base 2025-12-04T12:53:07.9642986Z * [new branch] gh/swolchok/842/head -> origin/gh/swolchok/842/head 2025-12-04T12:53:07.9643058Z * [new branch] gh/swolchok/842/orig -> origin/gh/swolchok/842/orig 2025-12-04T12:53:07.9643127Z * [new branch] gh/swolchok/845/base -> origin/gh/swolchok/845/base 2025-12-04T12:53:07.9643197Z * [new branch] gh/swolchok/845/head -> origin/gh/swolchok/845/head 2025-12-04T12:53:07.9643269Z * [new branch] gh/swolchok/845/orig -> origin/gh/swolchok/845/orig 2025-12-04T12:53:07.9643338Z * [new branch] gh/swolchok/848/base -> origin/gh/swolchok/848/base 2025-12-04T12:53:07.9643438Z * [new branch] gh/swolchok/848/head -> origin/gh/swolchok/848/head 2025-12-04T12:53:07.9643509Z * [new branch] gh/swolchok/848/orig -> origin/gh/swolchok/848/orig 2025-12-04T12:53:07.9643578Z * [new branch] gh/swolchok/856/base -> 
origin/gh/swolchok/856/base 2025-12-04T12:53:07.9643647Z * [new branch] gh/swolchok/856/head -> origin/gh/swolchok/856/head 2025-12-04T12:53:07.9643722Z * [new branch] gh/swolchok/856/orig -> origin/gh/swolchok/856/orig 2025-12-04T12:53:07.9643790Z * [new branch] gh/swolchok/860/base -> origin/gh/swolchok/860/base 2025-12-04T12:53:07.9643861Z * [new branch] gh/swolchok/860/head -> origin/gh/swolchok/860/head 2025-12-04T12:53:07.9643930Z * [new branch] gh/swolchok/860/orig -> origin/gh/swolchok/860/orig 2025-12-04T12:53:07.9643999Z * [new branch] gh/swolchok/861/base -> origin/gh/swolchok/861/base 2025-12-04T12:53:07.9644070Z * [new branch] gh/swolchok/861/head -> origin/gh/swolchok/861/head 2025-12-04T12:53:07.9644139Z * [new branch] gh/swolchok/861/orig -> origin/gh/swolchok/861/orig 2025-12-04T12:53:07.9644208Z * [new branch] gh/swolchok/862/base -> origin/gh/swolchok/862/base 2025-12-04T12:53:07.9644277Z * [new branch] gh/swolchok/862/head -> origin/gh/swolchok/862/head 2025-12-04T12:53:07.9644347Z * [new branch] gh/swolchok/862/orig -> origin/gh/swolchok/862/orig 2025-12-04T12:53:07.9644416Z * [new branch] gh/swolchok/863/base -> origin/gh/swolchok/863/base 2025-12-04T12:53:07.9644486Z * [new branch] gh/swolchok/863/head -> origin/gh/swolchok/863/head 2025-12-04T12:53:07.9644559Z * [new branch] gh/swolchok/863/orig -> origin/gh/swolchok/863/orig 2025-12-04T12:53:07.9644629Z * [new branch] gh/swolchok/864/base -> origin/gh/swolchok/864/base 2025-12-04T12:53:07.9644702Z * [new branch] gh/swolchok/864/head -> origin/gh/swolchok/864/head 2025-12-04T12:53:07.9644771Z * [new branch] gh/swolchok/864/orig -> origin/gh/swolchok/864/orig 2025-12-04T12:53:07.9644840Z * [new branch] gh/swolchok/865/base -> origin/gh/swolchok/865/base 2025-12-04T12:53:07.9644910Z * [new branch] gh/swolchok/865/head -> origin/gh/swolchok/865/head 2025-12-04T12:53:07.9644980Z * [new branch] gh/swolchok/865/orig -> origin/gh/swolchok/865/orig 2025-12-04T12:53:07.9645049Z * [new branch] gh/swolchok/866/base -> origin/gh/swolchok/866/base 2025-12-04T12:53:07.9645146Z * [new branch] gh/swolchok/866/head -> origin/gh/swolchok/866/head 2025-12-04T12:53:07.9645215Z * [new branch] gh/swolchok/866/orig -> origin/gh/swolchok/866/orig 2025-12-04T12:53:07.9645284Z * [new branch] gh/swolchok/867/base -> origin/gh/swolchok/867/base 2025-12-04T12:53:07.9645355Z * [new branch] gh/swolchok/867/head -> origin/gh/swolchok/867/head 2025-12-04T12:53:07.9645424Z * [new branch] gh/swolchok/867/orig -> origin/gh/swolchok/867/orig 2025-12-04T12:53:07.9645493Z * [new branch] gh/swolchok/868/base -> origin/gh/swolchok/868/base 2025-12-04T12:53:07.9645565Z * [new branch] gh/swolchok/868/head -> origin/gh/swolchok/868/head 2025-12-04T12:53:07.9645636Z * [new branch] gh/swolchok/868/orig -> origin/gh/swolchok/868/orig 2025-12-04T12:53:07.9645705Z * [new branch] gh/swolchok/869/base -> origin/gh/swolchok/869/base 2025-12-04T12:53:07.9645777Z * [new branch] gh/swolchok/869/head -> origin/gh/swolchok/869/head 2025-12-04T12:53:07.9645846Z * [new branch] gh/swolchok/869/orig -> origin/gh/swolchok/869/orig 2025-12-04T12:53:07.9645915Z * [new branch] gh/swolchok/870/base -> origin/gh/swolchok/870/base 2025-12-04T12:53:07.9646018Z * [new branch] gh/swolchok/870/head -> origin/gh/swolchok/870/head 2025-12-04T12:53:07.9646089Z * [new branch] gh/swolchok/870/orig -> origin/gh/swolchok/870/orig 2025-12-04T12:53:07.9646162Z * [new branch] gh/swolchok/871/base -> origin/gh/swolchok/871/base 2025-12-04T12:53:07.9646232Z * [new branch] gh/swolchok/871/head -> 
origin/gh/swolchok/871/head 2025-12-04T12:53:07.9646302Z * [new branch] gh/swolchok/871/orig -> origin/gh/swolchok/871/orig 2025-12-04T12:53:07.9646376Z * [new branch] gh/teja-rao/4/base -> origin/gh/teja-rao/4/base 2025-12-04T12:53:07.9646445Z * [new branch] gh/teja-rao/4/head -> origin/gh/teja-rao/4/head 2025-12-04T12:53:07.9646512Z * [new branch] gh/teja-rao/4/orig -> origin/gh/teja-rao/4/orig 2025-12-04T12:53:07.9646583Z * [new branch] gh/tianyu-l/2/base -> origin/gh/tianyu-l/2/base 2025-12-04T12:53:07.9646652Z * [new branch] gh/tianyu-l/2/head -> origin/gh/tianyu-l/2/head 2025-12-04T12:53:07.9646719Z * [new branch] gh/tianyu-l/2/orig -> origin/gh/tianyu-l/2/orig 2025-12-04T12:53:07.9646787Z * [new branch] gh/tianyu-l/3/base -> origin/gh/tianyu-l/3/base 2025-12-04T12:53:07.9646853Z * [new branch] gh/tianyu-l/3/orig -> origin/gh/tianyu-l/3/orig 2025-12-04T12:53:07.9646920Z * [new branch] gh/tianyu-l/4/base -> origin/gh/tianyu-l/4/base 2025-12-04T12:53:07.9646989Z * [new branch] gh/tianyu-l/4/head -> origin/gh/tianyu-l/4/head 2025-12-04T12:53:07.9647058Z * [new branch] gh/tianyu-l/4/orig -> origin/gh/tianyu-l/4/orig 2025-12-04T12:53:07.9647147Z * [new branch] gh/tugsbayasgalan/10/base -> origin/gh/tugsbayasgalan/10/base 2025-12-04T12:53:07.9647232Z * [new branch] gh/tugsbayasgalan/10/head -> origin/gh/tugsbayasgalan/10/head 2025-12-04T12:53:07.9647319Z * [new branch] gh/tugsbayasgalan/10/orig -> origin/gh/tugsbayasgalan/10/orig 2025-12-04T12:53:07.9647404Z * [new branch] gh/tugsbayasgalan/13/base -> origin/gh/tugsbayasgalan/13/base 2025-12-04T12:53:07.9647489Z * [new branch] gh/tugsbayasgalan/13/head -> origin/gh/tugsbayasgalan/13/head 2025-12-04T12:53:07.9647571Z * [new branch] gh/tugsbayasgalan/13/orig -> origin/gh/tugsbayasgalan/13/orig 2025-12-04T12:53:07.9647654Z * [new branch] gh/tugsbayasgalan/17/base -> origin/gh/tugsbayasgalan/17/base 2025-12-04T12:53:07.9647764Z * [new branch] gh/tugsbayasgalan/17/head -> origin/gh/tugsbayasgalan/17/head 2025-12-04T12:53:07.9647845Z * [new branch] gh/tugsbayasgalan/17/orig -> origin/gh/tugsbayasgalan/17/orig 2025-12-04T12:53:07.9647930Z * [new branch] gh/tugsbayasgalan/2/base -> origin/gh/tugsbayasgalan/2/base 2025-12-04T12:53:07.9648012Z * [new branch] gh/tugsbayasgalan/2/head -> origin/gh/tugsbayasgalan/2/head 2025-12-04T12:53:07.9648091Z * [new branch] gh/tugsbayasgalan/2/orig -> origin/gh/tugsbayasgalan/2/orig 2025-12-04T12:53:07.9648174Z * [new branch] gh/tugsbayasgalan/28/base -> origin/gh/tugsbayasgalan/28/base 2025-12-04T12:53:07.9648255Z * [new branch] gh/tugsbayasgalan/28/head -> origin/gh/tugsbayasgalan/28/head 2025-12-04T12:53:07.9648336Z * [new branch] gh/tugsbayasgalan/28/orig -> origin/gh/tugsbayasgalan/28/orig 2025-12-04T12:53:07.9648420Z * [new branch] gh/tugsbayasgalan/32/base -> origin/gh/tugsbayasgalan/32/base 2025-12-04T12:53:07.9648501Z * [new branch] gh/tugsbayasgalan/32/head -> origin/gh/tugsbayasgalan/32/head 2025-12-04T12:53:07.9648583Z * [new branch] gh/tugsbayasgalan/32/orig -> origin/gh/tugsbayasgalan/32/orig 2025-12-04T12:53:07.9648664Z * [new branch] gh/tugsbayasgalan/35/base -> origin/gh/tugsbayasgalan/35/base 2025-12-04T12:53:07.9648773Z * [new branch] gh/tugsbayasgalan/35/head -> origin/gh/tugsbayasgalan/35/head 2025-12-04T12:53:07.9648855Z * [new branch] gh/tugsbayasgalan/35/orig -> origin/gh/tugsbayasgalan/35/orig 2025-12-04T12:53:07.9648936Z * [new branch] gh/tugsbayasgalan/36/base -> origin/gh/tugsbayasgalan/36/base 2025-12-04T12:53:07.9649016Z * [new branch] gh/tugsbayasgalan/36/head -> 
origin/gh/tugsbayasgalan/36/head 2025-12-04T12:53:07.9649097Z * [new branch] gh/tugsbayasgalan/36/orig -> origin/gh/tugsbayasgalan/36/orig 2025-12-04T12:53:07.9649180Z * [new branch] gh/tugsbayasgalan/37/base -> origin/gh/tugsbayasgalan/37/base 2025-12-04T12:53:07.9649260Z * [new branch] gh/tugsbayasgalan/37/head -> origin/gh/tugsbayasgalan/37/head 2025-12-04T12:53:07.9649342Z * [new branch] gh/tugsbayasgalan/37/orig -> origin/gh/tugsbayasgalan/37/orig 2025-12-04T12:53:07.9649423Z * [new branch] gh/tugsbayasgalan/43/base -> origin/gh/tugsbayasgalan/43/base 2025-12-04T12:53:07.9649503Z * [new branch] gh/tugsbayasgalan/43/head -> origin/gh/tugsbayasgalan/43/head 2025-12-04T12:53:07.9649585Z * [new branch] gh/tugsbayasgalan/43/orig -> origin/gh/tugsbayasgalan/43/orig 2025-12-04T12:53:07.9649665Z * [new branch] gh/tugsbayasgalan/48/base -> origin/gh/tugsbayasgalan/48/base 2025-12-04T12:53:07.9649747Z * [new branch] gh/tugsbayasgalan/48/head -> origin/gh/tugsbayasgalan/48/head 2025-12-04T12:53:07.9649830Z * [new branch] gh/tugsbayasgalan/48/orig -> origin/gh/tugsbayasgalan/48/orig 2025-12-04T12:53:07.9649911Z * [new branch] gh/tugsbayasgalan/51/base -> origin/gh/tugsbayasgalan/51/base 2025-12-04T12:53:07.9649991Z * [new branch] gh/tugsbayasgalan/51/head -> origin/gh/tugsbayasgalan/51/head 2025-12-04T12:53:07.9650075Z * [new branch] gh/tugsbayasgalan/51/orig -> origin/gh/tugsbayasgalan/51/orig 2025-12-04T12:53:07.9650155Z * [new branch] gh/tugsbayasgalan/52/base -> origin/gh/tugsbayasgalan/52/base 2025-12-04T12:53:07.9650272Z * [new branch] gh/tugsbayasgalan/52/head -> origin/gh/tugsbayasgalan/52/head 2025-12-04T12:53:07.9650356Z * [new branch] gh/tugsbayasgalan/52/orig -> origin/gh/tugsbayasgalan/52/orig 2025-12-04T12:53:07.9650437Z * [new branch] gh/tugsbayasgalan/53/base -> origin/gh/tugsbayasgalan/53/base 2025-12-04T12:53:07.9650517Z * [new branch] gh/tugsbayasgalan/53/head -> origin/gh/tugsbayasgalan/53/head 2025-12-04T12:53:07.9650655Z * [new branch] gh/tugsbayasgalan/53/orig -> origin/gh/tugsbayasgalan/53/orig 2025-12-04T12:53:07.9650738Z * [new branch] gh/tugsbayasgalan/55/base -> origin/gh/tugsbayasgalan/55/base 2025-12-04T12:53:07.9650819Z * [new branch] gh/tugsbayasgalan/55/head -> origin/gh/tugsbayasgalan/55/head 2025-12-04T12:53:07.9650900Z * [new branch] gh/tugsbayasgalan/55/orig -> origin/gh/tugsbayasgalan/55/orig 2025-12-04T12:53:07.9650981Z * [new branch] gh/tugsbayasgalan/59/base -> origin/gh/tugsbayasgalan/59/base 2025-12-04T12:53:07.9651064Z * [new branch] gh/tugsbayasgalan/59/head -> origin/gh/tugsbayasgalan/59/head 2025-12-04T12:53:07.9651145Z * [new branch] gh/tugsbayasgalan/59/orig -> origin/gh/tugsbayasgalan/59/orig 2025-12-04T12:53:07.9651224Z * [new branch] gh/tugsbayasgalan/6/base -> origin/gh/tugsbayasgalan/6/base 2025-12-04T12:53:07.9651307Z * [new branch] gh/tugsbayasgalan/6/head -> origin/gh/tugsbayasgalan/6/head 2025-12-04T12:53:07.9651385Z * [new branch] gh/tugsbayasgalan/6/orig -> origin/gh/tugsbayasgalan/6/orig 2025-12-04T12:53:07.9651466Z * [new branch] gh/tugsbayasgalan/60/base -> origin/gh/tugsbayasgalan/60/base 2025-12-04T12:53:07.9651598Z * [new branch] gh/tugsbayasgalan/60/head -> origin/gh/tugsbayasgalan/60/head 2025-12-04T12:53:07.9651679Z * [new branch] gh/tugsbayasgalan/60/orig -> origin/gh/tugsbayasgalan/60/orig 2025-12-04T12:53:07.9651759Z * [new branch] gh/tugsbayasgalan/61/base -> origin/gh/tugsbayasgalan/61/base 2025-12-04T12:53:07.9651841Z * [new branch] gh/tugsbayasgalan/61/head -> origin/gh/tugsbayasgalan/61/head 2025-12-04T12:53:07.9651922Z 
* [new branch] gh/tugsbayasgalan/61/orig -> origin/gh/tugsbayasgalan/61/orig 2025-12-04T12:53:07.9652006Z * [new branch] gh/tugsbayasgalan/63/base -> origin/gh/tugsbayasgalan/63/base 2025-12-04T12:53:07.9652088Z * [new branch] gh/tugsbayasgalan/63/head -> origin/gh/tugsbayasgalan/63/head 2025-12-04T12:53:07.9652169Z * [new branch] gh/tugsbayasgalan/63/orig -> origin/gh/tugsbayasgalan/63/orig 2025-12-04T12:53:07.9652250Z * [new branch] gh/tugsbayasgalan/67/base -> origin/gh/tugsbayasgalan/67/base 2025-12-04T12:53:07.9652334Z * [new branch] gh/tugsbayasgalan/67/head -> origin/gh/tugsbayasgalan/67/head 2025-12-04T12:53:07.9652414Z * [new branch] gh/tugsbayasgalan/67/orig -> origin/gh/tugsbayasgalan/67/orig 2025-12-04T12:53:07.9652497Z * [new branch] gh/tugsbayasgalan/68/base -> origin/gh/tugsbayasgalan/68/base 2025-12-04T12:53:07.9652577Z * [new branch] gh/tugsbayasgalan/68/head -> origin/gh/tugsbayasgalan/68/head 2025-12-04T12:53:07.9652658Z * [new branch] gh/tugsbayasgalan/68/orig -> origin/gh/tugsbayasgalan/68/orig 2025-12-04T12:53:07.9652740Z * [new branch] gh/tugsbayasgalan/7/base -> origin/gh/tugsbayasgalan/7/base 2025-12-04T12:53:07.9652821Z * [new branch] gh/tugsbayasgalan/7/head -> origin/gh/tugsbayasgalan/7/head 2025-12-04T12:53:07.9652899Z * [new branch] gh/tugsbayasgalan/7/orig -> origin/gh/tugsbayasgalan/7/orig 2025-12-04T12:53:07.9652984Z * [new branch] gh/tugsbayasgalan/70/base -> origin/gh/tugsbayasgalan/70/base 2025-12-04T12:53:07.9653064Z * [new branch] gh/tugsbayasgalan/70/head -> origin/gh/tugsbayasgalan/70/head 2025-12-04T12:53:07.9653147Z * [new branch] gh/tugsbayasgalan/70/orig -> origin/gh/tugsbayasgalan/70/orig 2025-12-04T12:53:07.9653228Z * [new branch] gh/tugsbayasgalan/71/base -> origin/gh/tugsbayasgalan/71/base 2025-12-04T12:53:07.9653308Z * [new branch] gh/tugsbayasgalan/71/head -> origin/gh/tugsbayasgalan/71/head 2025-12-04T12:53:07.9653416Z * [new branch] gh/tugsbayasgalan/71/orig -> origin/gh/tugsbayasgalan/71/orig 2025-12-04T12:53:07.9653498Z * [new branch] gh/tugsbayasgalan/72/base -> origin/gh/tugsbayasgalan/72/base 2025-12-04T12:53:07.9653580Z * [new branch] gh/tugsbayasgalan/72/head -> origin/gh/tugsbayasgalan/72/head 2025-12-04T12:53:07.9653662Z * [new branch] gh/tugsbayasgalan/72/orig -> origin/gh/tugsbayasgalan/72/orig 2025-12-04T12:53:07.9653744Z * [new branch] gh/tugsbayasgalan/73/base -> origin/gh/tugsbayasgalan/73/base 2025-12-04T12:53:07.9653824Z * [new branch] gh/tugsbayasgalan/73/head -> origin/gh/tugsbayasgalan/73/head 2025-12-04T12:53:07.9653905Z * [new branch] gh/tugsbayasgalan/73/orig -> origin/gh/tugsbayasgalan/73/orig 2025-12-04T12:53:07.9653986Z * [new branch] gh/tugsbayasgalan/74/base -> origin/gh/tugsbayasgalan/74/base 2025-12-04T12:53:07.9654068Z * [new branch] gh/tugsbayasgalan/74/head -> origin/gh/tugsbayasgalan/74/head 2025-12-04T12:53:07.9654151Z * [new branch] gh/tugsbayasgalan/74/orig -> origin/gh/tugsbayasgalan/74/orig 2025-12-04T12:53:07.9654233Z * [new branch] gh/tugsbayasgalan/75/base -> origin/gh/tugsbayasgalan/75/base 2025-12-04T12:53:07.9654313Z * [new branch] gh/tugsbayasgalan/75/head -> origin/gh/tugsbayasgalan/75/head 2025-12-04T12:53:07.9654419Z * [new branch] gh/tugsbayasgalan/75/orig -> origin/gh/tugsbayasgalan/75/orig 2025-12-04T12:53:07.9654500Z * [new branch] gh/tugsbayasgalan/76/base -> origin/gh/tugsbayasgalan/76/base 2025-12-04T12:53:07.9654580Z * [new branch] gh/tugsbayasgalan/76/head -> origin/gh/tugsbayasgalan/76/head 2025-12-04T12:53:07.9654662Z * [new branch] gh/tugsbayasgalan/76/orig -> 
origin/gh/tugsbayasgalan/76/orig 2025-12-04T12:53:07.9654744Z * [new branch] gh/tugsbayasgalan/77/base -> origin/gh/tugsbayasgalan/77/base 2025-12-04T12:53:07.9654826Z * [new branch] gh/tugsbayasgalan/77/head -> origin/gh/tugsbayasgalan/77/head 2025-12-04T12:53:07.9654907Z * [new branch] gh/tugsbayasgalan/77/orig -> origin/gh/tugsbayasgalan/77/orig 2025-12-04T12:53:07.9654987Z * [new branch] gh/tugsbayasgalan/78/base -> origin/gh/tugsbayasgalan/78/base 2025-12-04T12:53:07.9655070Z * [new branch] gh/tugsbayasgalan/78/head -> origin/gh/tugsbayasgalan/78/head 2025-12-04T12:53:07.9655153Z * [new branch] gh/tugsbayasgalan/78/orig -> origin/gh/tugsbayasgalan/78/orig 2025-12-04T12:53:07.9655235Z * [new branch] gh/tugsbayasgalan/79/base -> origin/gh/tugsbayasgalan/79/base 2025-12-04T12:53:07.9655316Z * [new branch] gh/tugsbayasgalan/79/head -> origin/gh/tugsbayasgalan/79/head 2025-12-04T12:53:07.9655397Z * [new branch] gh/tugsbayasgalan/79/orig -> origin/gh/tugsbayasgalan/79/orig 2025-12-04T12:53:07.9655478Z * [new branch] gh/tugsbayasgalan/8/base -> origin/gh/tugsbayasgalan/8/base 2025-12-04T12:53:07.9655558Z * [new branch] gh/tugsbayasgalan/8/head -> origin/gh/tugsbayasgalan/8/head 2025-12-04T12:53:07.9655638Z * [new branch] gh/tugsbayasgalan/8/orig -> origin/gh/tugsbayasgalan/8/orig 2025-12-04T12:53:07.9655721Z * [new branch] gh/tugsbayasgalan/80/base -> origin/gh/tugsbayasgalan/80/base 2025-12-04T12:53:07.9655804Z * [new branch] gh/tugsbayasgalan/80/head -> origin/gh/tugsbayasgalan/80/head 2025-12-04T12:53:07.9655886Z * [new branch] gh/tugsbayasgalan/80/orig -> origin/gh/tugsbayasgalan/80/orig 2025-12-04T12:53:07.9655967Z * [new branch] gh/tugsbayasgalan/81/base -> origin/gh/tugsbayasgalan/81/base 2025-12-04T12:53:07.9656049Z * [new branch] gh/tugsbayasgalan/81/head -> origin/gh/tugsbayasgalan/81/head 2025-12-04T12:53:07.9656129Z * [new branch] gh/tugsbayasgalan/81/orig -> origin/gh/tugsbayasgalan/81/orig 2025-12-04T12:53:07.9656237Z * [new branch] gh/tugsbayasgalan/82/base -> origin/gh/tugsbayasgalan/82/base 2025-12-04T12:53:07.9656319Z * [new branch] gh/tugsbayasgalan/82/head -> origin/gh/tugsbayasgalan/82/head 2025-12-04T12:53:07.9656399Z * [new branch] gh/tugsbayasgalan/82/orig -> origin/gh/tugsbayasgalan/82/orig 2025-12-04T12:53:07.9656481Z * [new branch] gh/tugsbayasgalan/83/base -> origin/gh/tugsbayasgalan/83/base 2025-12-04T12:53:07.9656567Z * [new branch] gh/tugsbayasgalan/83/head -> origin/gh/tugsbayasgalan/83/head 2025-12-04T12:53:07.9656648Z * [new branch] gh/tugsbayasgalan/83/orig -> origin/gh/tugsbayasgalan/83/orig 2025-12-04T12:53:07.9656729Z * [new branch] gh/tugsbayasgalan/84/base -> origin/gh/tugsbayasgalan/84/base 2025-12-04T12:53:07.9656812Z * [new branch] gh/tugsbayasgalan/84/head -> origin/gh/tugsbayasgalan/84/head 2025-12-04T12:53:07.9656894Z * [new branch] gh/tugsbayasgalan/84/orig -> origin/gh/tugsbayasgalan/84/orig 2025-12-04T12:53:07.9656974Z * [new branch] gh/tugsbayasgalan/85/base -> origin/gh/tugsbayasgalan/85/base 2025-12-04T12:53:07.9657056Z * [new branch] gh/tugsbayasgalan/85/head -> origin/gh/tugsbayasgalan/85/head 2025-12-04T12:53:07.9657161Z * [new branch] gh/tugsbayasgalan/85/orig -> origin/gh/tugsbayasgalan/85/orig 2025-12-04T12:53:07.9657243Z * [new branch] gh/tugsbayasgalan/86/base -> origin/gh/tugsbayasgalan/86/base 2025-12-04T12:53:07.9657323Z * [new branch] gh/tugsbayasgalan/86/head -> origin/gh/tugsbayasgalan/86/head 2025-12-04T12:53:07.9657404Z * [new branch] gh/tugsbayasgalan/86/orig -> origin/gh/tugsbayasgalan/86/orig 2025-12-04T12:53:07.9657485Z 
* [new branch] gh/tugsbayasgalan/87/base -> origin/gh/tugsbayasgalan/87/base 2025-12-04T12:53:07.9657566Z * [new branch] gh/tugsbayasgalan/87/head -> origin/gh/tugsbayasgalan/87/head 2025-12-04T12:53:07.9657647Z * [new branch] gh/tugsbayasgalan/87/orig -> origin/gh/tugsbayasgalan/87/orig 2025-12-04T12:53:07.9657729Z * [new branch] gh/tugsbayasgalan/88/base -> origin/gh/tugsbayasgalan/88/base 2025-12-04T12:53:07.9657810Z * [new branch] gh/tugsbayasgalan/88/head -> origin/gh/tugsbayasgalan/88/head 2025-12-04T12:53:07.9657891Z * [new branch] gh/tugsbayasgalan/88/orig -> origin/gh/tugsbayasgalan/88/orig 2025-12-04T12:53:07.9657972Z * [new branch] gh/tugsbayasgalan/89/base -> origin/gh/tugsbayasgalan/89/base 2025-12-04T12:53:07.9658052Z * [new branch] gh/tugsbayasgalan/89/head -> origin/gh/tugsbayasgalan/89/head 2025-12-04T12:53:07.9658132Z * [new branch] gh/tugsbayasgalan/89/orig -> origin/gh/tugsbayasgalan/89/orig 2025-12-04T12:53:07.9658212Z * [new branch] gh/tugsbayasgalan/9/base -> origin/gh/tugsbayasgalan/9/base 2025-12-04T12:53:07.9658294Z * [new branch] gh/tugsbayasgalan/9/head -> origin/gh/tugsbayasgalan/9/head 2025-12-04T12:53:07.9658373Z * [new branch] gh/tugsbayasgalan/9/orig -> origin/gh/tugsbayasgalan/9/orig 2025-12-04T12:53:07.9658456Z * [new branch] gh/tugsbayasgalan/90/base -> origin/gh/tugsbayasgalan/90/base 2025-12-04T12:53:07.9658538Z * [new branch] gh/tugsbayasgalan/90/head -> origin/gh/tugsbayasgalan/90/head 2025-12-04T12:53:07.9658619Z * [new branch] gh/tugsbayasgalan/90/orig -> origin/gh/tugsbayasgalan/90/orig 2025-12-04T12:53:07.9658700Z * [new branch] gh/tugsbayasgalan/91/base -> origin/gh/tugsbayasgalan/91/base 2025-12-04T12:53:07.9658781Z * [new branch] gh/tugsbayasgalan/91/head -> origin/gh/tugsbayasgalan/91/head 2025-12-04T12:53:07.9658864Z * [new branch] gh/tugsbayasgalan/91/orig -> origin/gh/tugsbayasgalan/91/orig 2025-12-04T12:53:07.9658944Z * [new branch] gh/tugsbayasgalan/92/base -> origin/gh/tugsbayasgalan/92/base 2025-12-04T12:53:07.9659052Z * [new branch] gh/tugsbayasgalan/92/head -> origin/gh/tugsbayasgalan/92/head 2025-12-04T12:53:07.9659135Z * [new branch] gh/tugsbayasgalan/92/orig -> origin/gh/tugsbayasgalan/92/orig 2025-12-04T12:53:07.9659217Z * [new branch] gh/tugsbayasgalan/93/base -> origin/gh/tugsbayasgalan/93/base 2025-12-04T12:53:07.9659299Z * [new branch] gh/tugsbayasgalan/93/head -> origin/gh/tugsbayasgalan/93/head 2025-12-04T12:53:07.9659379Z * [new branch] gh/tugsbayasgalan/93/orig -> origin/gh/tugsbayasgalan/93/orig 2025-12-04T12:53:07.9659446Z * [new branch] gh/v0i0/14/base -> origin/gh/v0i0/14/base 2025-12-04T12:53:07.9659511Z * [new branch] gh/v0i0/14/head -> origin/gh/v0i0/14/head 2025-12-04T12:53:07.9659574Z * [new branch] gh/v0i0/14/orig -> origin/gh/v0i0/14/orig 2025-12-04T12:53:07.9659639Z * [new branch] gh/v0i0/15/base -> origin/gh/v0i0/15/base 2025-12-04T12:53:07.9659701Z * [new branch] gh/v0i0/15/head -> origin/gh/v0i0/15/head 2025-12-04T12:53:07.9659763Z * [new branch] gh/v0i0/15/orig -> origin/gh/v0i0/15/orig 2025-12-04T12:53:07.9659824Z * [new branch] gh/v0i0/16/base -> origin/gh/v0i0/16/base 2025-12-04T12:53:07.9659915Z * [new branch] gh/v0i0/16/head -> origin/gh/v0i0/16/head 2025-12-04T12:53:07.9659978Z * [new branch] gh/v0i0/16/orig -> origin/gh/v0i0/16/orig 2025-12-04T12:53:07.9660039Z * [new branch] gh/v0i0/17/base -> origin/gh/v0i0/17/base 2025-12-04T12:53:07.9660101Z * [new branch] gh/v0i0/17/head -> origin/gh/v0i0/17/head 2025-12-04T12:53:07.9660163Z * [new branch] gh/v0i0/17/orig -> origin/gh/v0i0/17/orig 
2025-12-04T12:53:07.9660262Z * [new branch] gh/v0i0/18/base -> origin/gh/v0i0/18/base 2025-12-04T12:53:07.9660331Z * [new branch] gh/v0i0/18/head -> origin/gh/v0i0/18/head 2025-12-04T12:53:07.9660392Z * [new branch] gh/v0i0/18/orig -> origin/gh/v0i0/18/orig 2025-12-04T12:53:07.9660453Z * [new branch] gh/v0i0/19/base -> origin/gh/v0i0/19/base 2025-12-04T12:53:07.9660518Z * [new branch] gh/v0i0/19/head -> origin/gh/v0i0/19/head 2025-12-04T12:53:07.9660579Z * [new branch] gh/v0i0/19/orig -> origin/gh/v0i0/19/orig 2025-12-04T12:53:07.9660658Z * [new branch] gh/vishal9-team/1/base -> origin/gh/vishal9-team/1/base 2025-12-04T12:53:07.9660733Z * [new branch] gh/vishal9-team/1/head -> origin/gh/vishal9-team/1/head 2025-12-04T12:53:07.9660806Z * [new branch] gh/vishal9-team/2/base -> origin/gh/vishal9-team/2/base 2025-12-04T12:53:07.9660879Z * [new branch] gh/vishal9-team/2/head -> origin/gh/vishal9-team/2/head 2025-12-04T12:53:07.9660955Z * [new branch] gh/vishal9-team/2/orig -> origin/gh/vishal9-team/2/orig 2025-12-04T12:53:07.9661028Z * [new branch] gh/vishal9-team/3/base -> origin/gh/vishal9-team/3/base 2025-12-04T12:53:07.9661099Z * [new branch] gh/vishal9-team/3/head -> origin/gh/vishal9-team/3/head 2025-12-04T12:53:07.9661174Z * [new branch] gh/vishal9-team/3/orig -> origin/gh/vishal9-team/3/orig 2025-12-04T12:53:07.9661248Z * [new branch] gh/vishal9-team/4/base -> origin/gh/vishal9-team/4/base 2025-12-04T12:53:07.9661320Z * [new branch] gh/vishal9-team/4/head -> origin/gh/vishal9-team/4/head 2025-12-04T12:53:07.9661393Z * [new branch] gh/vishal9-team/4/orig -> origin/gh/vishal9-team/4/orig 2025-12-04T12:53:07.9661457Z * [new branch] gh/vkuzo/1/next -> origin/gh/vkuzo/1/next 2025-12-04T12:53:07.9661570Z * [new branch] gh/vkuzo/2/next -> origin/gh/vkuzo/2/next 2025-12-04T12:53:07.9661637Z * [new branch] gh/vkuzo/3/next -> origin/gh/vkuzo/3/next 2025-12-04T12:53:07.9661710Z * [new branch] gh/wconstab/424/base -> origin/gh/wconstab/424/base 2025-12-04T12:53:07.9661782Z * [new branch] gh/wconstab/424/head -> origin/gh/wconstab/424/head 2025-12-04T12:53:07.9661855Z * [new branch] gh/wconstab/424/orig -> origin/gh/wconstab/424/orig 2025-12-04T12:53:07.9661925Z * [new branch] gh/wconstab/435/base -> origin/gh/wconstab/435/base 2025-12-04T12:53:07.9661996Z * [new branch] gh/wconstab/435/head -> origin/gh/wconstab/435/head 2025-12-04T12:53:07.9662065Z * [new branch] gh/wconstab/435/orig -> origin/gh/wconstab/435/orig 2025-12-04T12:53:07.9662134Z * [new branch] gh/wconstab/444/base -> origin/gh/wconstab/444/base 2025-12-04T12:53:07.9662206Z * [new branch] gh/wconstab/444/head -> origin/gh/wconstab/444/head 2025-12-04T12:53:07.9662275Z * [new branch] gh/wconstab/444/orig -> origin/gh/wconstab/444/orig 2025-12-04T12:53:07.9662345Z * [new branch] gh/wconstab/447/base -> origin/gh/wconstab/447/base 2025-12-04T12:53:07.9662415Z * [new branch] gh/wconstab/447/head -> origin/gh/wconstab/447/head 2025-12-04T12:53:07.9662532Z * [new branch] gh/wconstab/447/orig -> origin/gh/wconstab/447/orig 2025-12-04T12:53:07.9662602Z * [new branch] gh/wconstab/448/base -> origin/gh/wconstab/448/base 2025-12-04T12:53:07.9662671Z * [new branch] gh/wconstab/448/head -> origin/gh/wconstab/448/head 2025-12-04T12:53:07.9662740Z * [new branch] gh/wconstab/448/orig -> origin/gh/wconstab/448/orig 2025-12-04T12:53:07.9662811Z * [new branch] gh/wconstab/449/base -> origin/gh/wconstab/449/base 2025-12-04T12:53:07.9662883Z * [new branch] gh/wconstab/449/head -> origin/gh/wconstab/449/head 2025-12-04T12:53:07.9662952Z * [new branch] 
gh/wconstab/449/orig -> origin/gh/wconstab/449/orig 2025-12-04T12:53:07.9663020Z * [new branch] gh/wconstab/450/base -> origin/gh/wconstab/450/base 2025-12-04T12:53:07.9663091Z * [new branch] gh/wconstab/450/head -> origin/gh/wconstab/450/head 2025-12-04T12:53:07.9663162Z * [new branch] gh/wconstab/450/orig -> origin/gh/wconstab/450/orig 2025-12-04T12:53:07.9663231Z * [new branch] gh/wconstab/451/base -> origin/gh/wconstab/451/base 2025-12-04T12:53:07.9663301Z * [new branch] gh/wconstab/451/head -> origin/gh/wconstab/451/head 2025-12-04T12:53:07.9663371Z * [new branch] gh/wconstab/451/orig -> origin/gh/wconstab/451/orig 2025-12-04T12:53:07.9663441Z * [new branch] gh/wconstab/452/base -> origin/gh/wconstab/452/base 2025-12-04T12:53:07.9663513Z * [new branch] gh/wconstab/452/head -> origin/gh/wconstab/452/head 2025-12-04T12:53:07.9663582Z * [new branch] gh/wconstab/452/orig -> origin/gh/wconstab/452/orig 2025-12-04T12:53:07.9663653Z * [new branch] gh/wconstab/453/base -> origin/gh/wconstab/453/base 2025-12-04T12:53:07.9663723Z * [new branch] gh/wconstab/453/head -> origin/gh/wconstab/453/head 2025-12-04T12:53:07.9663795Z * [new branch] gh/wconstab/453/orig -> origin/gh/wconstab/453/orig 2025-12-04T12:53:07.9663865Z * [new branch] gh/wconstab/454/base -> origin/gh/wconstab/454/base 2025-12-04T12:53:07.9663934Z * [new branch] gh/wconstab/454/head -> origin/gh/wconstab/454/head 2025-12-04T12:53:07.9664004Z * [new branch] gh/wconstab/454/orig -> origin/gh/wconstab/454/orig 2025-12-04T12:53:07.9664074Z * [new branch] gh/wconstab/455/base -> origin/gh/wconstab/455/base 2025-12-04T12:53:07.9664173Z * [new branch] gh/wconstab/455/head -> origin/gh/wconstab/455/head 2025-12-04T12:53:07.9664244Z * [new branch] gh/wconstab/455/orig -> origin/gh/wconstab/455/orig 2025-12-04T12:53:07.9664315Z * [new branch] gh/wconstab/456/base -> origin/gh/wconstab/456/base 2025-12-04T12:53:07.9664387Z * [new branch] gh/wconstab/456/head -> origin/gh/wconstab/456/head 2025-12-04T12:53:07.9664458Z * [new branch] gh/wconstab/456/orig -> origin/gh/wconstab/456/orig 2025-12-04T12:53:07.9664529Z * [new branch] gh/wconstab/457/base -> origin/gh/wconstab/457/base 2025-12-04T12:53:07.9664598Z * [new branch] gh/wconstab/457/head -> origin/gh/wconstab/457/head 2025-12-04T12:53:07.9664667Z * [new branch] gh/wconstab/457/orig -> origin/gh/wconstab/457/orig 2025-12-04T12:53:07.9664736Z * [new branch] gh/wconstab/458/base -> origin/gh/wconstab/458/base 2025-12-04T12:53:07.9664807Z * [new branch] gh/wconstab/458/head -> origin/gh/wconstab/458/head 2025-12-04T12:53:07.9664876Z * [new branch] gh/wconstab/458/orig -> origin/gh/wconstab/458/orig 2025-12-04T12:53:07.9664946Z * [new branch] gh/wconstab/459/base -> origin/gh/wconstab/459/base 2025-12-04T12:53:07.9665042Z * [new branch] gh/wconstab/459/head -> origin/gh/wconstab/459/head 2025-12-04T12:53:07.9665113Z * [new branch] gh/wconstab/459/orig -> origin/gh/wconstab/459/orig 2025-12-04T12:53:07.9665182Z * [new branch] gh/wconstab/460/base -> origin/gh/wconstab/460/base 2025-12-04T12:53:07.9665252Z * [new branch] gh/wconstab/460/head -> origin/gh/wconstab/460/head 2025-12-04T12:53:07.9665321Z * [new branch] gh/wconstab/460/orig -> origin/gh/wconstab/460/orig 2025-12-04T12:53:07.9665391Z * [new branch] gh/wconstab/461/base -> origin/gh/wconstab/461/base 2025-12-04T12:53:07.9665462Z * [new branch] gh/wconstab/461/head -> origin/gh/wconstab/461/head 2025-12-04T12:53:07.9665532Z * [new branch] gh/wconstab/461/orig -> origin/gh/wconstab/461/orig 2025-12-04T12:53:07.9665602Z * [new branch] 
gh/wconstab/462/base -> origin/gh/wconstab/462/base 2025-12-04T12:53:07.9665672Z * [new branch] gh/wconstab/462/head -> origin/gh/wconstab/462/head 2025-12-04T12:53:07.9665742Z * [new branch] gh/wconstab/462/orig -> origin/gh/wconstab/462/orig 2025-12-04T12:53:07.9665813Z * [new branch] gh/wconstab/463/base -> origin/gh/wconstab/463/base 2025-12-04T12:53:07.9665882Z * [new branch] gh/wconstab/463/head -> origin/gh/wconstab/463/head 2025-12-04T12:53:07.9665953Z * [new branch] gh/wconstab/463/orig -> origin/gh/wconstab/463/orig 2025-12-04T12:53:07.9666022Z * [new branch] gh/wconstab/464/base -> origin/gh/wconstab/464/base 2025-12-04T12:53:07.9666094Z * [new branch] gh/wconstab/464/head -> origin/gh/wconstab/464/head 2025-12-04T12:53:07.9666164Z * [new branch] gh/wconstab/464/orig -> origin/gh/wconstab/464/orig 2025-12-04T12:53:07.9666233Z * [new branch] gh/wconstab/465/base -> origin/gh/wconstab/465/base 2025-12-04T12:53:07.9666303Z * [new branch] gh/wconstab/465/head -> origin/gh/wconstab/465/head 2025-12-04T12:53:07.9666373Z * [new branch] gh/wconstab/465/orig -> origin/gh/wconstab/465/orig 2025-12-04T12:53:07.9666442Z * [new branch] gh/wconstab/466/base -> origin/gh/wconstab/466/base 2025-12-04T12:53:07.9666513Z * [new branch] gh/wconstab/466/head -> origin/gh/wconstab/466/head 2025-12-04T12:53:07.9666582Z * [new branch] gh/wconstab/466/orig -> origin/gh/wconstab/466/orig 2025-12-04T12:53:07.9666652Z * [new branch] gh/wconstab/467/base -> origin/gh/wconstab/467/base 2025-12-04T12:53:07.9666756Z * [new branch] gh/wconstab/467/head -> origin/gh/wconstab/467/head 2025-12-04T12:53:07.9666827Z * [new branch] gh/wconstab/467/orig -> origin/gh/wconstab/467/orig 2025-12-04T12:53:07.9666896Z * [new branch] gh/wconstab/468/base -> origin/gh/wconstab/468/base 2025-12-04T12:53:07.9666970Z * [new branch] gh/wconstab/468/head -> origin/gh/wconstab/468/head 2025-12-04T12:53:07.9667039Z * [new branch] gh/wconstab/468/orig -> origin/gh/wconstab/468/orig 2025-12-04T12:53:07.9667110Z * [new branch] gh/weifengpy/39/base -> origin/gh/weifengpy/39/base 2025-12-04T12:53:07.9667182Z * [new branch] gh/weifengpy/39/head -> origin/gh/weifengpy/39/head 2025-12-04T12:53:07.9667253Z * [new branch] gh/weifengpy/39/orig -> origin/gh/weifengpy/39/orig 2025-12-04T12:53:07.9667323Z * [new branch] gh/weifengpy/40/base -> origin/gh/weifengpy/40/base 2025-12-04T12:53:07.9667395Z * [new branch] gh/weifengpy/40/head -> origin/gh/weifengpy/40/head 2025-12-04T12:53:07.9667465Z * [new branch] gh/weifengpy/40/orig -> origin/gh/weifengpy/40/orig 2025-12-04T12:53:07.9667535Z * [new branch] gh/weifengpy/41/base -> origin/gh/weifengpy/41/base 2025-12-04T12:53:07.9667637Z * [new branch] gh/weifengpy/41/head -> origin/gh/weifengpy/41/head 2025-12-04T12:53:07.9667708Z * [new branch] gh/weifengpy/41/orig -> origin/gh/weifengpy/41/orig 2025-12-04T12:53:07.9667789Z * [new branch] gh/williamwen42/250/base -> origin/gh/williamwen42/250/base 2025-12-04T12:53:07.9667868Z * [new branch] gh/williamwen42/250/head -> origin/gh/williamwen42/250/head 2025-12-04T12:53:07.9667946Z * [new branch] gh/williamwen42/250/orig -> origin/gh/williamwen42/250/orig 2025-12-04T12:53:07.9668029Z * [new branch] gh/williamwen42/279/base -> origin/gh/williamwen42/279/base 2025-12-04T12:53:07.9668105Z * [new branch] gh/williamwen42/279/head -> origin/gh/williamwen42/279/head 2025-12-04T12:53:07.9668180Z * [new branch] gh/williamwen42/279/orig -> origin/gh/williamwen42/279/orig 2025-12-04T12:53:07.9668259Z * [new branch] gh/williamwen42/282/base -> 
origin/gh/williamwen42/282/base 2025-12-04T12:53:07.9668336Z * [new branch] gh/williamwen42/282/head -> origin/gh/williamwen42/282/head 2025-12-04T12:53:07.9668412Z * [new branch] gh/williamwen42/282/orig -> origin/gh/williamwen42/282/orig 2025-12-04T12:53:07.9668489Z * [new branch] gh/williamwen42/287/base -> origin/gh/williamwen42/287/base 2025-12-04T12:53:07.9668567Z * [new branch] gh/williamwen42/287/head -> origin/gh/williamwen42/287/head 2025-12-04T12:53:07.9668643Z * [new branch] gh/williamwen42/287/orig -> origin/gh/williamwen42/287/orig 2025-12-04T12:53:07.9668722Z * [new branch] gh/williamwen42/288/base -> origin/gh/williamwen42/288/base 2025-12-04T12:53:07.9668798Z * [new branch] gh/williamwen42/288/head -> origin/gh/williamwen42/288/head 2025-12-04T12:53:07.9668874Z * [new branch] gh/williamwen42/288/orig -> origin/gh/williamwen42/288/orig 2025-12-04T12:53:07.9668954Z * [new branch] gh/williamwen42/296/base -> origin/gh/williamwen42/296/base 2025-12-04T12:53:07.9669032Z * [new branch] gh/williamwen42/296/head -> origin/gh/williamwen42/296/head 2025-12-04T12:53:07.9669109Z * [new branch] gh/williamwen42/296/orig -> origin/gh/williamwen42/296/orig 2025-12-04T12:53:07.9669186Z * [new branch] gh/williamwen42/297/base -> origin/gh/williamwen42/297/base 2025-12-04T12:53:07.9669262Z * [new branch] gh/williamwen42/297/head -> origin/gh/williamwen42/297/head 2025-12-04T12:53:07.9669338Z * [new branch] gh/williamwen42/297/orig -> origin/gh/williamwen42/297/orig 2025-12-04T12:53:07.9669446Z * [new branch] gh/williamwen42/306/base -> origin/gh/williamwen42/306/base 2025-12-04T12:53:07.9669522Z * [new branch] gh/williamwen42/306/head -> origin/gh/williamwen42/306/head 2025-12-04T12:53:07.9669600Z * [new branch] gh/williamwen42/306/orig -> origin/gh/williamwen42/306/orig 2025-12-04T12:53:07.9669677Z * [new branch] gh/williamwen42/309/base -> origin/gh/williamwen42/309/base 2025-12-04T12:53:07.9669752Z * [new branch] gh/williamwen42/309/head -> origin/gh/williamwen42/309/head 2025-12-04T12:53:07.9669829Z * [new branch] gh/williamwen42/309/orig -> origin/gh/williamwen42/309/orig 2025-12-04T12:53:07.9669904Z * [new branch] gh/williamwen42/310/base -> origin/gh/williamwen42/310/base 2025-12-04T12:53:07.9669980Z * [new branch] gh/williamwen42/310/head -> origin/gh/williamwen42/310/head 2025-12-04T12:53:07.9670058Z * [new branch] gh/williamwen42/310/orig -> origin/gh/williamwen42/310/orig 2025-12-04T12:53:07.9670133Z * [new branch] gh/williamwen42/311/base -> origin/gh/williamwen42/311/base 2025-12-04T12:53:07.9670272Z * [new branch] gh/williamwen42/311/head -> origin/gh/williamwen42/311/head 2025-12-04T12:53:07.9670406Z * [new branch] gh/williamwen42/311/orig -> origin/gh/williamwen42/311/orig 2025-12-04T12:53:07.9670483Z * [new branch] gh/williamwen42/319/base -> origin/gh/williamwen42/319/base 2025-12-04T12:53:07.9670559Z * [new branch] gh/williamwen42/319/head -> origin/gh/williamwen42/319/head 2025-12-04T12:53:07.9670635Z * [new branch] gh/williamwen42/319/orig -> origin/gh/williamwen42/319/orig 2025-12-04T12:53:07.9670711Z * [new branch] gh/williamwen42/325/base -> origin/gh/williamwen42/325/base 2025-12-04T12:53:07.9670786Z * [new branch] gh/williamwen42/325/head -> origin/gh/williamwen42/325/head 2025-12-04T12:53:07.9670866Z * [new branch] gh/williamwen42/325/orig -> origin/gh/williamwen42/325/orig 2025-12-04T12:53:07.9670942Z * [new branch] gh/williamwen42/326/base -> origin/gh/williamwen42/326/base 2025-12-04T12:53:07.9671021Z * [new branch] gh/williamwen42/326/head -> 
origin/gh/williamwen42/326/head 2025-12-04T12:53:07.9671098Z * [new branch] gh/williamwen42/326/orig -> origin/gh/williamwen42/326/orig 2025-12-04T12:53:07.9671174Z * [new branch] gh/williamwen42/327/base -> origin/gh/williamwen42/327/base 2025-12-04T12:53:07.9671252Z * [new branch] gh/williamwen42/327/head -> origin/gh/williamwen42/327/head 2025-12-04T12:53:07.9671330Z * [new branch] gh/williamwen42/327/orig -> origin/gh/williamwen42/327/orig 2025-12-04T12:53:07.9671407Z * [new branch] gh/williamwen42/328/base -> origin/gh/williamwen42/328/base 2025-12-04T12:53:07.9671486Z * [new branch] gh/williamwen42/328/head -> origin/gh/williamwen42/328/head 2025-12-04T12:53:07.9671563Z * [new branch] gh/williamwen42/328/orig -> origin/gh/williamwen42/328/orig 2025-12-04T12:53:07.9671639Z * [new branch] gh/williamwen42/329/base -> origin/gh/williamwen42/329/base 2025-12-04T12:53:07.9671717Z * [new branch] gh/williamwen42/329/head -> origin/gh/williamwen42/329/head 2025-12-04T12:53:07.9671793Z * [new branch] gh/williamwen42/329/orig -> origin/gh/williamwen42/329/orig 2025-12-04T12:53:07.9671870Z * [new branch] gh/williamwen42/330/base -> origin/gh/williamwen42/330/base 2025-12-04T12:53:07.9671950Z * [new branch] gh/williamwen42/330/head -> origin/gh/williamwen42/330/head 2025-12-04T12:53:07.9672028Z * [new branch] gh/williamwen42/330/orig -> origin/gh/williamwen42/330/orig 2025-12-04T12:53:07.9672106Z * [new branch] gh/williamwen42/331/base -> origin/gh/williamwen42/331/base 2025-12-04T12:53:07.9672247Z * [new branch] gh/williamwen42/331/head -> origin/gh/williamwen42/331/head 2025-12-04T12:53:07.9672323Z * [new branch] gh/williamwen42/331/orig -> origin/gh/williamwen42/331/orig 2025-12-04T12:53:07.9672398Z * [new branch] gh/williamwen42/332/base -> origin/gh/williamwen42/332/base 2025-12-04T12:53:07.9672477Z * [new branch] gh/williamwen42/332/head -> origin/gh/williamwen42/332/head 2025-12-04T12:53:07.9672553Z * [new branch] gh/williamwen42/332/orig -> origin/gh/williamwen42/332/orig 2025-12-04T12:53:07.9672631Z * [new branch] gh/williamwen42/333/base -> origin/gh/williamwen42/333/base 2025-12-04T12:53:07.9672709Z * [new branch] gh/williamwen42/333/head -> origin/gh/williamwen42/333/head 2025-12-04T12:53:07.9672786Z * [new branch] gh/williamwen42/333/orig -> origin/gh/williamwen42/333/orig 2025-12-04T12:53:07.9672865Z * [new branch] gh/williamwen42/334/base -> origin/gh/williamwen42/334/base 2025-12-04T12:53:07.9672942Z * [new branch] gh/williamwen42/334/head -> origin/gh/williamwen42/334/head 2025-12-04T12:53:07.9673018Z * [new branch] gh/williamwen42/334/orig -> origin/gh/williamwen42/334/orig 2025-12-04T12:53:07.9673533Z * [new branch] gh/williamwen42/335/base -> origin/gh/williamwen42/335/base 2025-12-04T12:53:07.9673611Z * [new branch] gh/williamwen42/335/head -> origin/gh/williamwen42/335/head 2025-12-04T12:53:07.9673688Z * [new branch] gh/williamwen42/335/orig -> origin/gh/williamwen42/335/orig 2025-12-04T12:53:07.9673767Z * [new branch] gh/williamwen42/336/base -> origin/gh/williamwen42/336/base 2025-12-04T12:53:07.9673843Z * [new branch] gh/williamwen42/336/head -> origin/gh/williamwen42/336/head 2025-12-04T12:53:07.9673920Z * [new branch] gh/williamwen42/336/orig -> origin/gh/williamwen42/336/orig 2025-12-04T12:53:07.9674002Z * [new branch] gh/williamwen42/337/base -> origin/gh/williamwen42/337/base 2025-12-04T12:53:07.9674078Z * [new branch] gh/williamwen42/337/head -> origin/gh/williamwen42/337/head 2025-12-04T12:53:07.9674153Z * [new branch] gh/williamwen42/337/orig -> 
origin/gh/williamwen42/337/orig 2025-12-04T12:53:07.9674233Z * [new branch] gh/williamwen42/338/base -> origin/gh/williamwen42/338/base 2025-12-04T12:53:07.9674309Z * [new branch] gh/williamwen42/338/head -> origin/gh/williamwen42/338/head 2025-12-04T12:53:07.9674386Z * [new branch] gh/williamwen42/338/orig -> origin/gh/williamwen42/338/orig 2025-12-04T12:53:07.9674464Z * [new branch] gh/williamwen42/339/base -> origin/gh/williamwen42/339/base 2025-12-04T12:53:07.9674541Z * [new branch] gh/williamwen42/339/head -> origin/gh/williamwen42/339/head 2025-12-04T12:53:07.9674622Z * [new branch] gh/williamwen42/339/orig -> origin/gh/williamwen42/339/orig 2025-12-04T12:53:07.9674698Z * [new branch] gh/williamwen42/340/base -> origin/gh/williamwen42/340/base 2025-12-04T12:53:07.9674776Z * [new branch] gh/williamwen42/340/head -> origin/gh/williamwen42/340/head 2025-12-04T12:53:07.9674856Z * [new branch] gh/williamwen42/340/orig -> origin/gh/williamwen42/340/orig 2025-12-04T12:53:07.9674933Z * [new branch] gh/williamwen42/341/base -> origin/gh/williamwen42/341/base 2025-12-04T12:53:07.9675009Z * [new branch] gh/williamwen42/341/head -> origin/gh/williamwen42/341/head 2025-12-04T12:53:07.9675086Z * [new branch] gh/williamwen42/341/orig -> origin/gh/williamwen42/341/orig 2025-12-04T12:53:07.9675162Z * [new branch] gh/williamwen42/342/base -> origin/gh/williamwen42/342/base 2025-12-04T12:53:07.9675239Z * [new branch] gh/williamwen42/342/head -> origin/gh/williamwen42/342/head 2025-12-04T12:53:07.9675346Z * [new branch] gh/williamwen42/342/orig -> origin/gh/williamwen42/342/orig 2025-12-04T12:53:07.9675422Z * [new branch] gh/williamwen42/343/base -> origin/gh/williamwen42/343/base 2025-12-04T12:53:07.9675498Z * [new branch] gh/williamwen42/343/head -> origin/gh/williamwen42/343/head 2025-12-04T12:53:07.9675579Z * [new branch] gh/williamwen42/343/orig -> origin/gh/williamwen42/343/orig 2025-12-04T12:53:07.9675655Z * [new branch] gh/williamwen42/344/base -> origin/gh/williamwen42/344/base 2025-12-04T12:53:07.9675731Z * [new branch] gh/williamwen42/344/head -> origin/gh/williamwen42/344/head 2025-12-04T12:53:07.9675808Z * [new branch] gh/williamwen42/344/orig -> origin/gh/williamwen42/344/orig 2025-12-04T12:53:07.9675884Z * [new branch] gh/williamwen42/345/base -> origin/gh/williamwen42/345/base 2025-12-04T12:53:07.9675962Z * [new branch] gh/williamwen42/345/head -> origin/gh/williamwen42/345/head 2025-12-04T12:53:07.9676038Z * [new branch] gh/williamwen42/345/orig -> origin/gh/williamwen42/345/orig 2025-12-04T12:53:07.9676116Z * [new branch] gh/williamwen42/346/base -> origin/gh/williamwen42/346/base 2025-12-04T12:53:07.9676193Z * [new branch] gh/williamwen42/346/head -> origin/gh/williamwen42/346/head 2025-12-04T12:53:07.9676301Z * [new branch] gh/williamwen42/346/orig -> origin/gh/williamwen42/346/orig 2025-12-04T12:53:07.9676378Z * [new branch] gh/williamwen42/347/base -> origin/gh/williamwen42/347/base 2025-12-04T12:53:07.9676457Z * [new branch] gh/williamwen42/347/head -> origin/gh/williamwen42/347/head 2025-12-04T12:53:07.9676535Z * [new branch] gh/williamwen42/347/orig -> origin/gh/williamwen42/347/orig 2025-12-04T12:53:07.9676610Z * [new branch] gh/williamwen42/348/base -> origin/gh/williamwen42/348/base 2025-12-04T12:53:07.9676690Z * [new branch] gh/williamwen42/348/head -> origin/gh/williamwen42/348/head 2025-12-04T12:53:07.9676767Z * [new branch] gh/williamwen42/348/orig -> origin/gh/williamwen42/348/orig 2025-12-04T12:53:07.9676843Z * [new branch] gh/williamwen42/349/base -> 
origin/gh/williamwen42/349/base 2025-12-04T12:53:07.9676923Z * [new branch] gh/williamwen42/349/head -> origin/gh/williamwen42/349/head 2025-12-04T12:53:07.9676998Z * [new branch] gh/williamwen42/349/orig -> origin/gh/williamwen42/349/orig 2025-12-04T12:53:07.9677074Z * [new branch] gh/williamwen42/350/base -> origin/gh/williamwen42/350/base 2025-12-04T12:53:07.9677151Z * [new branch] gh/williamwen42/350/head -> origin/gh/williamwen42/350/head 2025-12-04T12:53:07.9677231Z * [new branch] gh/williamwen42/350/orig -> origin/gh/williamwen42/350/orig 2025-12-04T12:53:07.9677313Z * [new branch] gh/williamwen42/351/base -> origin/gh/williamwen42/351/base 2025-12-04T12:53:07.9677389Z * [new branch] gh/williamwen42/351/head -> origin/gh/williamwen42/351/head 2025-12-04T12:53:07.9677465Z * [new branch] gh/williamwen42/351/orig -> origin/gh/williamwen42/351/orig 2025-12-04T12:53:07.9677542Z * [new branch] gh/williamwen42/352/base -> origin/gh/williamwen42/352/base 2025-12-04T12:53:07.9677620Z * [new branch] gh/williamwen42/352/head -> origin/gh/williamwen42/352/head 2025-12-04T12:53:07.9677696Z * [new branch] gh/williamwen42/352/orig -> origin/gh/williamwen42/352/orig 2025-12-04T12:53:07.9677773Z * [new branch] gh/williamwen42/353/base -> origin/gh/williamwen42/353/base 2025-12-04T12:53:07.9677848Z * [new branch] gh/williamwen42/353/head -> origin/gh/williamwen42/353/head 2025-12-04T12:53:07.9677924Z * [new branch] gh/williamwen42/353/orig -> origin/gh/williamwen42/353/orig 2025-12-04T12:53:07.9678028Z * [new branch] gh/williamwen42/354/base -> origin/gh/williamwen42/354/base 2025-12-04T12:53:07.9678104Z * [new branch] gh/williamwen42/354/head -> origin/gh/williamwen42/354/head 2025-12-04T12:53:07.9678181Z * [new branch] gh/williamwen42/354/orig -> origin/gh/williamwen42/354/orig 2025-12-04T12:53:07.9678260Z * [new branch] gh/williamwen42/355/base -> origin/gh/williamwen42/355/base 2025-12-04T12:53:07.9678338Z * [new branch] gh/williamwen42/355/head -> origin/gh/williamwen42/355/head 2025-12-04T12:53:07.9678415Z * [new branch] gh/williamwen42/355/orig -> origin/gh/williamwen42/355/orig 2025-12-04T12:53:07.9678493Z * [new branch] gh/williamwen42/356/base -> origin/gh/williamwen42/356/base 2025-12-04T12:53:07.9678568Z * [new branch] gh/williamwen42/356/head -> origin/gh/williamwen42/356/head 2025-12-04T12:53:07.9678647Z * [new branch] gh/williamwen42/356/orig -> origin/gh/williamwen42/356/orig 2025-12-04T12:53:07.9678724Z * [new branch] gh/williamwen42/357/base -> origin/gh/williamwen42/357/base 2025-12-04T12:53:07.9678799Z * [new branch] gh/williamwen42/357/head -> origin/gh/williamwen42/357/head 2025-12-04T12:53:07.9678876Z * [new branch] gh/williamwen42/357/orig -> origin/gh/williamwen42/357/orig 2025-12-04T12:53:07.9678977Z * [new branch] gh/williamwen42/358/base -> origin/gh/williamwen42/358/base 2025-12-04T12:53:07.9679054Z * [new branch] gh/williamwen42/358/head -> origin/gh/williamwen42/358/head 2025-12-04T12:53:07.9679130Z * [new branch] gh/williamwen42/358/orig -> origin/gh/williamwen42/358/orig 2025-12-04T12:53:07.9679198Z * [new branch] gh/xmfan/169/base -> origin/gh/xmfan/169/base 2025-12-04T12:53:07.9679267Z * [new branch] gh/xmfan/169/head -> origin/gh/xmfan/169/head 2025-12-04T12:53:07.9679335Z * [new branch] gh/xmfan/170/base -> origin/gh/xmfan/170/base 2025-12-04T12:53:07.9679403Z * [new branch] gh/xmfan/170/head -> origin/gh/xmfan/170/head 2025-12-04T12:53:07.9679468Z * [new branch] gh/xmfan/274/base -> origin/gh/xmfan/274/base 2025-12-04T12:53:07.9679536Z * [new branch] 
gh/xmfan/274/head -> origin/gh/xmfan/274/head 2025-12-04T12:53:07.9679604Z * [new branch] gh/xmfan/274/orig -> origin/gh/xmfan/274/orig 2025-12-04T12:53:07.9679671Z * [new branch] gh/xmfan/277/base -> origin/gh/xmfan/277/base 2025-12-04T12:53:07.9679738Z * [new branch] gh/xmfan/277/head -> origin/gh/xmfan/277/head 2025-12-04T12:53:07.9679803Z * [new branch] gh/xmfan/277/orig -> origin/gh/xmfan/277/orig 2025-12-04T12:53:07.9679868Z * [new branch] gh/xmfan/301/base -> origin/gh/xmfan/301/base 2025-12-04T12:53:07.9679935Z * [new branch] gh/xmfan/301/head -> origin/gh/xmfan/301/head 2025-12-04T12:53:07.9680001Z * [new branch] gh/xmfan/301/orig -> origin/gh/xmfan/301/orig 2025-12-04T12:53:07.9680067Z * [new branch] gh/xmfan/304/base -> origin/gh/xmfan/304/base 2025-12-04T12:53:07.9680132Z * [new branch] gh/xmfan/304/head -> origin/gh/xmfan/304/head 2025-12-04T12:53:07.9680238Z * [new branch] gh/xmfan/304/orig -> origin/gh/xmfan/304/orig 2025-12-04T12:53:07.9680304Z * [new branch] gh/xmfan/309/base -> origin/gh/xmfan/309/base 2025-12-04T12:53:07.9680370Z * [new branch] gh/xmfan/309/head -> origin/gh/xmfan/309/head 2025-12-04T12:53:07.9680435Z * [new branch] gh/xmfan/309/orig -> origin/gh/xmfan/309/orig 2025-12-04T12:53:07.9680500Z * [new branch] gh/xmfan/310/base -> origin/gh/xmfan/310/base 2025-12-04T12:53:07.9680612Z * [new branch] gh/xmfan/310/head -> origin/gh/xmfan/310/head 2025-12-04T12:53:07.9680677Z * [new branch] gh/xmfan/310/orig -> origin/gh/xmfan/310/orig 2025-12-04T12:53:07.9680743Z * [new branch] gh/xmfan/311/base -> origin/gh/xmfan/311/base 2025-12-04T12:53:07.9680808Z * [new branch] gh/xmfan/311/head -> origin/gh/xmfan/311/head 2025-12-04T12:53:07.9680874Z * [new branch] gh/xmfan/311/orig -> origin/gh/xmfan/311/orig 2025-12-04T12:53:07.9680940Z * [new branch] gh/xmfan/312/base -> origin/gh/xmfan/312/base 2025-12-04T12:53:07.9681006Z * [new branch] gh/xmfan/312/head -> origin/gh/xmfan/312/head 2025-12-04T12:53:07.9681073Z * [new branch] gh/xmfan/312/orig -> origin/gh/xmfan/312/orig 2025-12-04T12:53:07.9681140Z * [new branch] gh/xmfan/313/base -> origin/gh/xmfan/313/base 2025-12-04T12:53:07.9681205Z * [new branch] gh/xmfan/313/head -> origin/gh/xmfan/313/head 2025-12-04T12:53:07.9681271Z * [new branch] gh/xmfan/313/orig -> origin/gh/xmfan/313/orig 2025-12-04T12:53:07.9681351Z * [new branch] gh/xuanzhang816/27/base -> origin/gh/xuanzhang816/27/base 2025-12-04T12:53:07.9681427Z * [new branch] gh/xuanzhang816/27/head -> origin/gh/xuanzhang816/27/head 2025-12-04T12:53:07.9681549Z * [new branch] gh/xuanzhang816/27/orig -> origin/gh/xuanzhang816/27/orig 2025-12-04T12:53:07.9681627Z * [new branch] gh/xuanzhang816/32/base -> origin/gh/xuanzhang816/32/base 2025-12-04T12:53:07.9681700Z * [new branch] gh/xuanzhang816/32/head -> origin/gh/xuanzhang816/32/head 2025-12-04T12:53:07.9681775Z * [new branch] gh/xuanzhang816/32/orig -> origin/gh/xuanzhang816/32/orig 2025-12-04T12:53:07.9681849Z * [new branch] gh/xuanzhang816/33/base -> origin/gh/xuanzhang816/33/base 2025-12-04T12:53:07.9681924Z * [new branch] gh/xuanzhang816/33/head -> origin/gh/xuanzhang816/33/head 2025-12-04T12:53:07.9681997Z * [new branch] gh/xuanzhang816/33/orig -> origin/gh/xuanzhang816/33/orig 2025-12-04T12:53:07.9682071Z * [new branch] gh/xuanzhang816/34/base -> origin/gh/xuanzhang816/34/base 2025-12-04T12:53:07.9682145Z * [new branch] gh/xuanzhang816/34/head -> origin/gh/xuanzhang816/34/head 2025-12-04T12:53:07.9682220Z * [new branch] gh/xuanzhang816/34/orig -> origin/gh/xuanzhang816/34/orig 2025-12-04T12:53:07.9682293Z * 
[new branch] gh/xuanzhang816/35/base -> origin/gh/xuanzhang816/35/base 2025-12-04T12:53:07.9682367Z * [new branch] gh/xuanzhang816/35/head -> origin/gh/xuanzhang816/35/head 2025-12-04T12:53:07.9682441Z * [new branch] gh/xuanzhang816/35/orig -> origin/gh/xuanzhang816/35/orig 2025-12-04T12:53:07.9682514Z * [new branch] gh/yanbing-j/11/base -> origin/gh/yanbing-j/11/base 2025-12-04T12:53:07.9682586Z * [new branch] gh/yanbing-j/11/head -> origin/gh/yanbing-j/11/head 2025-12-04T12:53:07.9682658Z * [new branch] gh/yanbing-j/11/orig -> origin/gh/yanbing-j/11/orig 2025-12-04T12:53:07.9682728Z * [new branch] gh/yanbing-j/12/base -> origin/gh/yanbing-j/12/base 2025-12-04T12:53:07.9682800Z * [new branch] gh/yanbing-j/12/head -> origin/gh/yanbing-j/12/head 2025-12-04T12:53:07.9682869Z * [new branch] gh/yanbing-j/12/orig -> origin/gh/yanbing-j/12/orig 2025-12-04T12:53:07.9682938Z * [new branch] gh/yanbing-j/13/base -> origin/gh/yanbing-j/13/base 2025-12-04T12:53:07.9683006Z * [new branch] gh/yanbing-j/13/head -> origin/gh/yanbing-j/13/head 2025-12-04T12:53:07.9683077Z * [new branch] gh/yanbing-j/13/orig -> origin/gh/yanbing-j/13/orig 2025-12-04T12:53:07.9683145Z * [new branch] gh/yanbing-j/14/base -> origin/gh/yanbing-j/14/base 2025-12-04T12:53:07.9683280Z * [new branch] gh/yanbing-j/14/head -> origin/gh/yanbing-j/14/head 2025-12-04T12:53:07.9683350Z * [new branch] gh/yanbing-j/14/orig -> origin/gh/yanbing-j/14/orig 2025-12-04T12:53:07.9683418Z * [new branch] gh/yanbing-j/15/base -> origin/gh/yanbing-j/15/base 2025-12-04T12:53:07.9683489Z * [new branch] gh/yanbing-j/15/head -> origin/gh/yanbing-j/15/head 2025-12-04T12:53:07.9683560Z * [new branch] gh/yanbing-j/15/orig -> origin/gh/yanbing-j/15/orig 2025-12-04T12:53:07.9683630Z * [new branch] gh/yanbing-j/18/base -> origin/gh/yanbing-j/18/base 2025-12-04T12:53:07.9683701Z * [new branch] gh/yanbing-j/18/head -> origin/gh/yanbing-j/18/head 2025-12-04T12:53:07.9683769Z * [new branch] gh/yanbing-j/18/orig -> origin/gh/yanbing-j/18/orig 2025-12-04T12:53:07.9683837Z * [new branch] gh/yanbing-j/19/base -> origin/gh/yanbing-j/19/base 2025-12-04T12:53:07.9683908Z * [new branch] gh/yanbing-j/19/head -> origin/gh/yanbing-j/19/head 2025-12-04T12:53:07.9683976Z * [new branch] gh/yanbing-j/19/orig -> origin/gh/yanbing-j/19/orig 2025-12-04T12:53:07.9684044Z * [new branch] gh/yanbing-j/20/base -> origin/gh/yanbing-j/20/base 2025-12-04T12:53:07.9684138Z * [new branch] gh/yanbing-j/20/head -> origin/gh/yanbing-j/20/head 2025-12-04T12:53:07.9684208Z * [new branch] gh/yanbing-j/20/orig -> origin/gh/yanbing-j/20/orig 2025-12-04T12:53:07.9684279Z * [new branch] gh/yanbing-j/21/base -> origin/gh/yanbing-j/21/base 2025-12-04T12:53:07.9684351Z * [new branch] gh/yanbing-j/21/head -> origin/gh/yanbing-j/21/head 2025-12-04T12:53:07.9684419Z * [new branch] gh/yanbing-j/22/base -> origin/gh/yanbing-j/22/base 2025-12-04T12:53:07.9684488Z * [new branch] gh/yanbing-j/22/head -> origin/gh/yanbing-j/22/head 2025-12-04T12:53:07.9684559Z * [new branch] gh/yanbing-j/22/orig -> origin/gh/yanbing-j/22/orig 2025-12-04T12:53:07.9684627Z * [new branch] gh/yanbing-j/23/base -> origin/gh/yanbing-j/23/base 2025-12-04T12:53:07.9684695Z * [new branch] gh/yanbing-j/23/head -> origin/gh/yanbing-j/23/head 2025-12-04T12:53:07.9684766Z * [new branch] gh/yanbing-j/23/orig -> origin/gh/yanbing-j/23/orig 2025-12-04T12:53:07.9684835Z * [new branch] gh/yanbing-j/24/base -> origin/gh/yanbing-j/24/base 2025-12-04T12:53:07.9684905Z * [new branch] gh/yanbing-j/24/head -> origin/gh/yanbing-j/24/head 
2025-12-04T12:53:07.9684974Z * [new branch] gh/yanbing-j/24/orig -> origin/gh/yanbing-j/24/orig 2025-12-04T12:53:07.9685044Z * [new branch] gh/yanbing-j/25/base -> origin/gh/yanbing-j/25/base 2025-12-04T12:53:07.9685115Z * [new branch] gh/yanbing-j/25/head -> origin/gh/yanbing-j/25/head 2025-12-04T12:53:07.9685184Z * [new branch] gh/yanbing-j/25/orig -> origin/gh/yanbing-j/25/orig 2025-12-04T12:53:07.9685252Z * [new branch] gh/yanbing-j/26/base -> origin/gh/yanbing-j/26/base 2025-12-04T12:53:07.9685322Z * [new branch] gh/yanbing-j/26/head -> origin/gh/yanbing-j/26/head 2025-12-04T12:53:07.9685391Z * [new branch] gh/yanbing-j/26/orig -> origin/gh/yanbing-j/26/orig 2025-12-04T12:53:07.9685468Z * [new branch] gh/yang-yu-hang/1/base -> origin/gh/yang-yu-hang/1/base 2025-12-04T12:53:07.9685543Z * [new branch] gh/yang-yu-hang/1/head -> origin/gh/yang-yu-hang/1/head 2025-12-04T12:53:07.9685616Z * [new branch] gh/yang-yu-hang/1/orig -> origin/gh/yang-yu-hang/1/orig 2025-12-04T12:53:07.9685688Z * [new branch] gh/yang-yu-hang/2/base -> origin/gh/yang-yu-hang/2/base 2025-12-04T12:53:07.9685785Z * [new branch] gh/yang-yu-hang/2/head -> origin/gh/yang-yu-hang/2/head 2025-12-04T12:53:07.9685857Z * [new branch] gh/yang-yu-hang/2/orig -> origin/gh/yang-yu-hang/2/orig 2025-12-04T12:53:07.9685927Z * [new branch] gh/yang-yu-hang/3/base -> origin/gh/yang-yu-hang/3/base 2025-12-04T12:53:07.9685999Z * [new branch] gh/yang-yu-hang/3/head -> origin/gh/yang-yu-hang/3/head 2025-12-04T12:53:07.9686070Z * [new branch] gh/yang-yu-hang/3/orig -> origin/gh/yang-yu-hang/3/orig 2025-12-04T12:53:07.9686142Z * [new branch] gh/yangw-dev/12/base -> origin/gh/yangw-dev/12/base 2025-12-04T12:53:07.9686216Z * [new branch] gh/yangw-dev/12/head -> origin/gh/yangw-dev/12/head 2025-12-04T12:53:07.9686286Z * [new branch] gh/yangw-dev/12/orig -> origin/gh/yangw-dev/12/orig 2025-12-04T12:53:07.9686356Z * [new branch] gh/yangw-dev/13/base -> origin/gh/yangw-dev/13/base 2025-12-04T12:53:07.9686428Z * [new branch] gh/yangw-dev/13/head -> origin/gh/yangw-dev/13/head 2025-12-04T12:53:07.9686497Z * [new branch] gh/yangw-dev/13/orig -> origin/gh/yangw-dev/13/orig 2025-12-04T12:53:07.9686567Z * [new branch] gh/yangw-dev/14/base -> origin/gh/yangw-dev/14/base 2025-12-04T12:53:07.9686637Z * [new branch] gh/yangw-dev/14/head -> origin/gh/yangw-dev/14/head 2025-12-04T12:53:07.9686751Z * [new branch] gh/yangw-dev/14/orig -> origin/gh/yangw-dev/14/orig 2025-12-04T12:53:07.9686821Z * [new branch] gh/yangw-dev/15/base -> origin/gh/yangw-dev/15/base 2025-12-04T12:53:07.9686891Z * [new branch] gh/yangw-dev/15/head -> origin/gh/yangw-dev/15/head 2025-12-04T12:53:07.9686960Z * [new branch] gh/yangw-dev/15/orig -> origin/gh/yangw-dev/15/orig 2025-12-04T12:53:07.9687029Z * [new branch] gh/yangw-dev/19/base -> origin/gh/yangw-dev/19/base 2025-12-04T12:53:07.9687100Z * [new branch] gh/yangw-dev/19/head -> origin/gh/yangw-dev/19/head 2025-12-04T12:53:07.9687169Z * [new branch] gh/yangw-dev/19/orig -> origin/gh/yangw-dev/19/orig 2025-12-04T12:53:07.9687239Z * [new branch] gh/yangw-dev/26/base -> origin/gh/yangw-dev/26/base 2025-12-04T12:53:07.9687307Z * [new branch] gh/yangw-dev/26/head -> origin/gh/yangw-dev/26/head 2025-12-04T12:53:07.9687380Z * [new branch] gh/yangw-dev/26/orig -> origin/gh/yangw-dev/26/orig 2025-12-04T12:53:07.9687451Z * [new branch] gh/yangw-dev/27/base -> origin/gh/yangw-dev/27/base 2025-12-04T12:53:07.9687521Z * [new branch] gh/yangw-dev/27/head -> origin/gh/yangw-dev/27/head 2025-12-04T12:53:07.9687590Z * [new branch] 
gh/yangw-dev/27/orig -> origin/gh/yangw-dev/27/orig 2025-12-04T12:53:07.9687659Z * [new branch] gh/ydwu4/292/base -> origin/gh/ydwu4/292/base 2025-12-04T12:53:07.9687727Z * [new branch] gh/ydwu4/292/head -> origin/gh/ydwu4/292/head 2025-12-04T12:53:07.9687792Z * [new branch] gh/ydwu4/292/orig -> origin/gh/ydwu4/292/orig 2025-12-04T12:53:07.9687858Z * [new branch] gh/ydwu4/294/base -> origin/gh/ydwu4/294/base 2025-12-04T12:53:07.9687922Z * [new branch] gh/ydwu4/294/head -> origin/gh/ydwu4/294/head 2025-12-04T12:53:07.9687990Z * [new branch] gh/ydwu4/294/orig -> origin/gh/ydwu4/294/orig 2025-12-04T12:53:07.9688058Z * [new branch] gh/ydwu4/295/base -> origin/gh/ydwu4/295/base 2025-12-04T12:53:07.9688123Z * [new branch] gh/ydwu4/295/head -> origin/gh/ydwu4/295/head 2025-12-04T12:53:07.9688188Z * [new branch] gh/ydwu4/295/orig -> origin/gh/ydwu4/295/orig 2025-12-04T12:53:07.9688253Z * [new branch] gh/ydwu4/296/base -> origin/gh/ydwu4/296/base 2025-12-04T12:53:07.9688345Z * [new branch] gh/ydwu4/296/head -> origin/gh/ydwu4/296/head 2025-12-04T12:53:07.9688498Z * [new branch] gh/ydwu4/296/orig -> origin/gh/ydwu4/296/orig 2025-12-04T12:53:07.9688563Z * [new branch] gh/ydwu4/306/base -> origin/gh/ydwu4/306/base 2025-12-04T12:53:07.9688630Z * [new branch] gh/ydwu4/306/head -> origin/gh/ydwu4/306/head 2025-12-04T12:53:07.9688697Z * [new branch] gh/ydwu4/306/orig -> origin/gh/ydwu4/306/orig 2025-12-04T12:53:07.9688762Z * [new branch] gh/ydwu4/312/base -> origin/gh/ydwu4/312/base 2025-12-04T12:53:07.9688826Z * [new branch] gh/ydwu4/312/head -> origin/gh/ydwu4/312/head 2025-12-04T12:53:07.9688892Z * [new branch] gh/ydwu4/312/orig -> origin/gh/ydwu4/312/orig 2025-12-04T12:53:07.9688956Z * [new branch] gh/ydwu4/322/base -> origin/gh/ydwu4/322/base 2025-12-04T12:53:07.9689022Z * [new branch] gh/ydwu4/322/head -> origin/gh/ydwu4/322/head 2025-12-04T12:53:07.9689088Z * [new branch] gh/ydwu4/322/orig -> origin/gh/ydwu4/322/orig 2025-12-04T12:53:07.9689153Z * [new branch] gh/ydwu4/327/base -> origin/gh/ydwu4/327/base 2025-12-04T12:53:07.9689217Z * [new branch] gh/ydwu4/327/head -> origin/gh/ydwu4/327/head 2025-12-04T12:53:07.9689327Z * [new branch] gh/ydwu4/327/orig -> origin/gh/ydwu4/327/orig 2025-12-04T12:53:07.9689392Z * [new branch] gh/ydwu4/328/base -> origin/gh/ydwu4/328/base 2025-12-04T12:53:07.9689456Z * [new branch] gh/ydwu4/328/head -> origin/gh/ydwu4/328/head 2025-12-04T12:53:07.9689524Z * [new branch] gh/ydwu4/328/orig -> origin/gh/ydwu4/328/orig 2025-12-04T12:53:07.9689588Z * [new branch] gh/ydwu4/329/base -> origin/gh/ydwu4/329/base 2025-12-04T12:53:07.9689652Z * [new branch] gh/ydwu4/329/head -> origin/gh/ydwu4/329/head 2025-12-04T12:53:07.9689719Z * [new branch] gh/ydwu4/329/orig -> origin/gh/ydwu4/329/orig 2025-12-04T12:53:07.9689783Z * [new branch] gh/ydwu4/330/base -> origin/gh/ydwu4/330/base 2025-12-04T12:53:07.9689848Z * [new branch] gh/ydwu4/330/head -> origin/gh/ydwu4/330/head 2025-12-04T12:53:07.9689914Z * [new branch] gh/ydwu4/330/orig -> origin/gh/ydwu4/330/orig 2025-12-04T12:53:07.9689979Z * [new branch] gh/ydwu4/331/base -> origin/gh/ydwu4/331/base 2025-12-04T12:53:07.9690047Z * [new branch] gh/ydwu4/331/head -> origin/gh/ydwu4/331/head 2025-12-04T12:53:07.9690111Z * [new branch] gh/ydwu4/331/orig -> origin/gh/ydwu4/331/orig 2025-12-04T12:53:07.9690213Z * [new branch] gh/ydwu4/332/base -> origin/gh/ydwu4/332/base 2025-12-04T12:53:07.9690281Z * [new branch] gh/ydwu4/332/head -> origin/gh/ydwu4/332/head 2025-12-04T12:53:07.9690350Z * [new branch] gh/ydwu4/332/orig -> 
origin/gh/ydwu4/332/orig 2025-12-04T12:53:07.9690414Z * [new branch] gh/ydwu4/333/base -> origin/gh/ydwu4/333/base 2025-12-04T12:53:07.9690479Z * [new branch] gh/ydwu4/333/head -> origin/gh/ydwu4/333/head 2025-12-04T12:53:07.9690546Z * [new branch] gh/ydwu4/333/orig -> origin/gh/ydwu4/333/orig 2025-12-04T12:53:07.9690611Z * [new branch] gh/ydwu4/334/base -> origin/gh/ydwu4/334/base 2025-12-04T12:53:07.9690677Z * [new branch] gh/ydwu4/334/head -> origin/gh/ydwu4/334/head 2025-12-04T12:53:07.9690742Z * [new branch] gh/ydwu4/334/orig -> origin/gh/ydwu4/334/orig 2025-12-04T12:53:07.9690807Z * [new branch] gh/ydwu4/335/base -> origin/gh/ydwu4/335/base 2025-12-04T12:53:07.9690873Z * [new branch] gh/ydwu4/335/head -> origin/gh/ydwu4/335/head 2025-12-04T12:53:07.9690984Z * [new branch] gh/ydwu4/335/orig -> origin/gh/ydwu4/335/orig 2025-12-04T12:53:07.9691048Z * [new branch] gh/ydwu4/337/base -> origin/gh/ydwu4/337/base 2025-12-04T12:53:07.9691114Z * [new branch] gh/ydwu4/337/head -> origin/gh/ydwu4/337/head 2025-12-04T12:53:07.9691180Z * [new branch] gh/ydwu4/337/orig -> origin/gh/ydwu4/337/orig 2025-12-04T12:53:07.9691245Z * [new branch] gh/ydwu4/339/base -> origin/gh/ydwu4/339/base 2025-12-04T12:53:07.9691312Z * [new branch] gh/ydwu4/339/head -> origin/gh/ydwu4/339/head 2025-12-04T12:53:07.9691377Z * [new branch] gh/ydwu4/339/orig -> origin/gh/ydwu4/339/orig 2025-12-04T12:53:07.9691440Z * [new branch] gh/yf225/133/base -> origin/gh/yf225/133/base 2025-12-04T12:53:07.9691505Z * [new branch] gh/yf225/133/head -> origin/gh/yf225/133/head 2025-12-04T12:53:07.9691572Z * [new branch] gh/yf225/93/base -> origin/gh/yf225/93/base 2025-12-04T12:53:07.9691638Z * [new branch] gh/yf225/93/head -> origin/gh/yf225/93/head 2025-12-04T12:53:07.9691710Z * [new branch] gh/yifuwang/152/base -> origin/gh/yifuwang/152/base 2025-12-04T12:53:07.9691833Z * [new branch] gh/yifuwang/152/head -> origin/gh/yifuwang/152/head 2025-12-04T12:53:07.9691906Z * [new branch] gh/yifuwang/152/orig -> origin/gh/yifuwang/152/orig 2025-12-04T12:53:07.9691978Z * [new branch] gh/yifuwang/195/base -> origin/gh/yifuwang/195/base 2025-12-04T12:53:07.9692050Z * [new branch] gh/yifuwang/195/head -> origin/gh/yifuwang/195/head 2025-12-04T12:53:07.9692122Z * [new branch] gh/yifuwang/195/orig -> origin/gh/yifuwang/195/orig 2025-12-04T12:53:07.9692193Z * [new branch] gh/yiming0416/1/base -> origin/gh/yiming0416/1/base 2025-12-04T12:53:07.9692264Z * [new branch] gh/yiming0416/1/head -> origin/gh/yiming0416/1/head 2025-12-04T12:53:07.9692334Z * [new branch] gh/yiming0416/2/base -> origin/gh/yiming0416/2/base 2025-12-04T12:53:07.9692403Z * [new branch] gh/yiming0416/2/head -> origin/gh/yiming0416/2/head 2025-12-04T12:53:07.9692477Z * [new branch] gh/yushangdi/1/base -> origin/gh/yushangdi/1/base 2025-12-04T12:53:07.9692549Z * [new branch] gh/yushangdi/1/head -> origin/gh/yushangdi/1/head 2025-12-04T12:53:07.9692619Z * [new branch] gh/yushangdi/10/base -> origin/gh/yushangdi/10/base 2025-12-04T12:53:07.9692690Z * [new branch] gh/yushangdi/10/head -> origin/gh/yushangdi/10/head 2025-12-04T12:53:07.9692761Z * [new branch] gh/yushangdi/10/orig -> origin/gh/yushangdi/10/orig 2025-12-04T12:53:07.9692831Z * [new branch] gh/yushangdi/11/base -> origin/gh/yushangdi/11/base 2025-12-04T12:53:07.9692902Z * [new branch] gh/yushangdi/11/head -> origin/gh/yushangdi/11/head 2025-12-04T12:53:07.9692974Z * [new branch] gh/yushangdi/11/orig -> origin/gh/yushangdi/11/orig 2025-12-04T12:53:07.9693045Z * [new branch] gh/yushangdi/2/base -> origin/gh/yushangdi/2/base 
2025-12-04T12:53:07.9693119Z * [new branch] gh/yushangdi/2/head -> origin/gh/yushangdi/2/head 2025-12-04T12:53:07.9693188Z * [new branch] gh/yushangdi/7/base -> origin/gh/yushangdi/7/base 2025-12-04T12:53:07.9693257Z * [new branch] gh/yushangdi/7/head -> origin/gh/yushangdi/7/head 2025-12-04T12:53:07.9693326Z * [new branch] gh/yushangdi/7/orig -> origin/gh/yushangdi/7/orig 2025-12-04T12:53:07.9693396Z * [new branch] gh/yushangdi/8/base -> origin/gh/yushangdi/8/base 2025-12-04T12:53:07.9693467Z * [new branch] gh/yushangdi/8/head -> origin/gh/yushangdi/8/head 2025-12-04T12:53:07.9693567Z * [new branch] gh/yushangdi/8/orig -> origin/gh/yushangdi/8/orig 2025-12-04T12:53:07.9693636Z * [new branch] gh/yushangdi/9/base -> origin/gh/yushangdi/9/base 2025-12-04T12:53:07.9693706Z * [new branch] gh/yushangdi/9/head -> origin/gh/yushangdi/9/head 2025-12-04T12:53:07.9693779Z * [new branch] gh/yushangdi/9/orig -> origin/gh/yushangdi/9/orig 2025-12-04T12:53:07.9693846Z * [new branch] gh/zklaus/19/base -> origin/gh/zklaus/19/base 2025-12-04T12:53:07.9693913Z * [new branch] gh/zklaus/19/head -> origin/gh/zklaus/19/head 2025-12-04T12:53:07.9693980Z * [new branch] gh/zklaus/19/orig -> origin/gh/zklaus/19/orig 2025-12-04T12:53:07.9694046Z * [new branch] gh/zklaus/20/base -> origin/gh/zklaus/20/base 2025-12-04T12:53:07.9694112Z * [new branch] gh/zklaus/20/head -> origin/gh/zklaus/20/head 2025-12-04T12:53:07.9694181Z * [new branch] gh/zklaus/20/orig -> origin/gh/zklaus/20/orig 2025-12-04T12:53:07.9694246Z * [new branch] gh/zklaus/21/base -> origin/gh/zklaus/21/base 2025-12-04T12:53:07.9694312Z * [new branch] gh/zklaus/21/head -> origin/gh/zklaus/21/head 2025-12-04T12:53:07.9694412Z * [new branch] gh/zklaus/21/orig -> origin/gh/zklaus/21/orig 2025-12-04T12:53:07.9694478Z * [new branch] gh/zklaus/22/base -> origin/gh/zklaus/22/base 2025-12-04T12:53:07.9694543Z * [new branch] gh/zklaus/22/head -> origin/gh/zklaus/22/head 2025-12-04T12:53:07.9694610Z * [new branch] gh/zklaus/22/orig -> origin/gh/zklaus/22/orig 2025-12-04T12:53:07.9694675Z * [new branch] gh/zklaus/23/base -> origin/gh/zklaus/23/base 2025-12-04T12:53:07.9694741Z * [new branch] gh/zklaus/23/head -> origin/gh/zklaus/23/head 2025-12-04T12:53:07.9694808Z * [new branch] gh/zklaus/23/orig -> origin/gh/zklaus/23/orig 2025-12-04T12:53:07.9694874Z * [new branch] gh/zklaus/24/base -> origin/gh/zklaus/24/base 2025-12-04T12:53:07.9694939Z * [new branch] gh/zklaus/24/head -> origin/gh/zklaus/24/head 2025-12-04T12:53:07.9695006Z * [new branch] gh/zklaus/24/orig -> origin/gh/zklaus/24/orig 2025-12-04T12:53:07.9695076Z * [new branch] gh/zou3519/1197/base -> origin/gh/zou3519/1197/base 2025-12-04T12:53:07.9695145Z * [new branch] gh/zou3519/1197/head -> origin/gh/zou3519/1197/head 2025-12-04T12:53:07.9695213Z * [new branch] gh/zou3519/1197/orig -> origin/gh/zou3519/1197/orig 2025-12-04T12:53:07.9695280Z * [new branch] gh/zou3519/1199/base -> origin/gh/zou3519/1199/base 2025-12-04T12:53:07.9695350Z * [new branch] gh/zou3519/1199/head -> origin/gh/zou3519/1199/head 2025-12-04T12:53:07.9695418Z * [new branch] gh/zou3519/1199/orig -> origin/gh/zou3519/1199/orig 2025-12-04T12:53:07.9695485Z * [new branch] gh/zou3519/1200/base -> origin/gh/zou3519/1200/base 2025-12-04T12:53:07.9695556Z * [new branch] gh/zou3519/1200/head -> origin/gh/zou3519/1200/head 2025-12-04T12:53:07.9695624Z * [new branch] gh/zou3519/1200/orig -> origin/gh/zou3519/1200/orig 2025-12-04T12:53:07.9695692Z * [new branch] gh/zou3519/1201/base -> origin/gh/zou3519/1201/base 2025-12-04T12:53:07.9695761Z * 
[new branch] gh/zou3519/1201/head -> origin/gh/zou3519/1201/head 2025-12-04T12:53:07.9695830Z * [new branch] gh/zou3519/1201/orig -> origin/gh/zou3519/1201/orig 2025-12-04T12:53:07.9695897Z * [new branch] gh/zou3519/1202/base -> origin/gh/zou3519/1202/base 2025-12-04T12:53:07.9695965Z * [new branch] gh/zou3519/1202/head -> origin/gh/zou3519/1202/head 2025-12-04T12:53:07.9696057Z * [new branch] gh/zou3519/1202/orig -> origin/gh/zou3519/1202/orig 2025-12-04T12:53:07.9696125Z * [new branch] gh/zpcore/1/base -> origin/gh/zpcore/1/base 2025-12-04T12:53:07.9696192Z * [new branch] gh/zpcore/1/head -> origin/gh/zpcore/1/head 2025-12-04T12:53:07.9696260Z * [new branch] gh/zpcore/11/base -> origin/gh/zpcore/11/base 2025-12-04T12:53:07.9696328Z * [new branch] gh/zpcore/11/head -> origin/gh/zpcore/11/head 2025-12-04T12:53:07.9696394Z * [new branch] gh/zpcore/11/orig -> origin/gh/zpcore/11/orig 2025-12-04T12:53:07.9696460Z * [new branch] gh/zpcore/12/base -> origin/gh/zpcore/12/base 2025-12-04T12:53:07.9696526Z * [new branch] gh/zpcore/12/head -> origin/gh/zpcore/12/head 2025-12-04T12:53:07.9696592Z * [new branch] gh/zpcore/12/orig -> origin/gh/zpcore/12/orig 2025-12-04T12:53:07.9696658Z * [new branch] gh/zpcore/13/base -> origin/gh/zpcore/13/base 2025-12-04T12:53:07.9696724Z * [new branch] gh/zpcore/13/head -> origin/gh/zpcore/13/head 2025-12-04T12:53:07.9696792Z * [new branch] gh/zpcore/13/orig -> origin/gh/zpcore/13/orig 2025-12-04T12:53:07.9696857Z * [new branch] gh/zpcore/14/base -> origin/gh/zpcore/14/base 2025-12-04T12:53:07.9696950Z * [new branch] gh/zpcore/14/head -> origin/gh/zpcore/14/head 2025-12-04T12:53:07.9697019Z * [new branch] gh/zpcore/14/orig -> origin/gh/zpcore/14/orig 2025-12-04T12:53:07.9697085Z * [new branch] gh/zpcore/15/base -> origin/gh/zpcore/15/base 2025-12-04T12:53:07.9697153Z * [new branch] gh/zpcore/15/head -> origin/gh/zpcore/15/head 2025-12-04T12:53:07.9697219Z * [new branch] gh/zpcore/15/orig -> origin/gh/zpcore/15/orig 2025-12-04T12:53:07.9697287Z * [new branch] gh/zpcore/2/base -> origin/gh/zpcore/2/base 2025-12-04T12:53:07.9697356Z * [new branch] gh/zpcore/2/head -> origin/gh/zpcore/2/head 2025-12-04T12:53:07.9697423Z * [new branch] gh/zpcore/21/base -> origin/gh/zpcore/21/base 2025-12-04T12:53:07.9697490Z * [new branch] gh/zpcore/21/head -> origin/gh/zpcore/21/head 2025-12-04T12:53:07.9697558Z * [new branch] gh/zpcore/21/orig -> origin/gh/zpcore/21/orig 2025-12-04T12:53:07.9697623Z * [new branch] gh/zpcore/22/base -> origin/gh/zpcore/22/base 2025-12-04T12:53:07.9697690Z * [new branch] gh/zpcore/22/head -> origin/gh/zpcore/22/head 2025-12-04T12:53:07.9697758Z * [new branch] gh/zpcore/22/orig -> origin/gh/zpcore/22/orig 2025-12-04T12:53:07.9697824Z * [new branch] gh/zpcore/23/base -> origin/gh/zpcore/23/base 2025-12-04T12:53:07.9697892Z * [new branch] gh/zpcore/23/head -> origin/gh/zpcore/23/head 2025-12-04T12:53:07.9697957Z * [new branch] gh/zpcore/23/orig -> origin/gh/zpcore/23/orig 2025-12-04T12:53:07.9698024Z * [new branch] gh/zpcore/24/base -> origin/gh/zpcore/24/base 2025-12-04T12:53:07.9698092Z * [new branch] gh/zpcore/24/head -> origin/gh/zpcore/24/head 2025-12-04T12:53:07.9698161Z * [new branch] gh/zpcore/24/orig -> origin/gh/zpcore/24/orig 2025-12-04T12:53:07.9698228Z * [new branch] gh/zpcore/25/base -> origin/gh/zpcore/25/base 2025-12-04T12:53:07.9698294Z * [new branch] gh/zpcore/25/head -> origin/gh/zpcore/25/head 2025-12-04T12:53:07.9698361Z * [new branch] gh/zpcore/25/orig -> origin/gh/zpcore/25/orig 2025-12-04T12:53:07.9698429Z * [new branch] 
gh/zpcore/26/base -> origin/gh/zpcore/26/base 2025-12-04T12:53:07.9698523Z * [new branch] gh/zpcore/26/head -> origin/gh/zpcore/26/head 2025-12-04T12:53:07.9698589Z * [new branch] gh/zpcore/26/orig -> origin/gh/zpcore/26/orig 2025-12-04T12:53:07.9698656Z * [new branch] gh/zpcore/27/base -> origin/gh/zpcore/27/base 2025-12-04T12:53:07.9698726Z * [new branch] gh/zpcore/27/head -> origin/gh/zpcore/27/head 2025-12-04T12:53:07.9698796Z * [new branch] gh/zpcore/27/orig -> origin/gh/zpcore/27/orig 2025-12-04T12:53:07.9698862Z * [new branch] gh/zpcore/28/base -> origin/gh/zpcore/28/base 2025-12-04T12:53:07.9698930Z * [new branch] gh/zpcore/28/head -> origin/gh/zpcore/28/head 2025-12-04T12:53:07.9698997Z * [new branch] gh/zpcore/28/orig -> origin/gh/zpcore/28/orig 2025-12-04T12:53:07.9699065Z * [new branch] gh/zpcore/3/base -> origin/gh/zpcore/3/base 2025-12-04T12:53:07.9699133Z * [new branch] gh/zpcore/3/head -> origin/gh/zpcore/3/head 2025-12-04T12:53:07.9699201Z * [new branch] gh/zpcore/4/base -> origin/gh/zpcore/4/base 2025-12-04T12:53:07.9699266Z * [new branch] gh/zpcore/4/head -> origin/gh/zpcore/4/head 2025-12-04T12:53:07.9699334Z * [new branch] gh/zpcore/5/base -> origin/gh/zpcore/5/base 2025-12-04T12:53:07.9699430Z * [new branch] gh/zpcore/5/head -> origin/gh/zpcore/5/head 2025-12-04T12:53:07.9699501Z * [new branch] gh/zpcore/6/base -> origin/gh/zpcore/6/base 2025-12-04T12:53:07.9699568Z * [new branch] gh/zpcore/6/head -> origin/gh/zpcore/6/head 2025-12-04T12:53:07.9699635Z * [new branch] gh/zpcore/7/base -> origin/gh/zpcore/7/base 2025-12-04T12:53:07.9699703Z * [new branch] gh/zpcore/7/head -> origin/gh/zpcore/7/head 2025-12-04T12:53:07.9699769Z * [new branch] gh/zpcore/8/base -> origin/gh/zpcore/8/base 2025-12-04T12:53:07.9699837Z * [new branch] gh/zpcore/8/head -> origin/gh/zpcore/8/head 2025-12-04T12:53:07.9699907Z * [new branch] google-main -> origin/google-main 2025-12-04T12:53:07.9699991Z * [new branch] guangyey/external_stream -> origin/guangyey/external_stream 2025-12-04T12:53:07.9700062Z * [new branch] guangyey/test_2025 -> origin/guangyey/test_2025 2025-12-04T12:53:07.9700232Z * [new branch] guilhermeleobas/cherry-pick-55d87d9dfd9 -> origin/guilhermeleobas/cherry-pick-55d87d9dfd9 2025-12-04T12:53:07.9700350Z * [new branch] hameerabbasi/complex_tensor_subclass -> origin/hameerabbasi/complex_tensor_subclass 2025-12-04T12:53:07.9700487Z * [new branch] hameerabbasi/fix-ctensor-gradcheck-tests -> origin/hameerabbasi/fix-ctensor-gradcheck-tests 2025-12-04T12:53:07.9700598Z * [new branch] hameerabbasi/gradcheck-allclose -> origin/hameerabbasi/gradcheck-allclose 2025-12-04T12:53:07.9700665Z * [new branch] hc_baseline -> origin/hc_baseline 2025-12-04T12:53:07.9700726Z * [new branch] hhh_rand -> origin/hhh_rand 2025-12-04T12:53:07.9700789Z * [new branch] huba/f1 -> origin/huba/f1 2025-12-04T12:53:07.9700977Z * [new branch] increase-timeout-linux-jammy-cuda12_8-py3_10-gcc11-test -> origin/increase-timeout-linux-jammy-cuda12_8-py3_10-gcc11-test 2025-12-04T12:53:07.9701039Z * [new branch] inlining -> origin/inlining 2025-12-04T12:53:07.9701113Z * [new branch] inlining-ezyang -> origin/inlining-ezyang 2025-12-04T12:53:07.9701197Z * [new branch] install-torchao-0.13.0 -> origin/install-torchao-0.13.0 2025-12-04T12:53:07.9701373Z * [new branch] instrument-trunk-pull-linux-with-job-test-filters -> origin/instrument-trunk-pull-linux-with-job-test-filters 2025-12-04T12:53:07.9701499Z * [new branch] invoke-subgraph -> origin/invoke-subgraph 2025-12-04T12:53:07.9701567Z * [new branch] issue#58739 -> 
origin/issue#58739 2025-12-04T12:53:07.9701648Z * [new branch] jainapurva-patch-1 -> origin/jainapurva-patch-1 2025-12-04T12:53:07.9701708Z * [new branch] jathu/o3 -> origin/jathu/o3 2025-12-04T12:53:07.9701770Z * [new branch] jathu/sve -> origin/jathu/sve 2025-12-04T12:53:07.9701895Z * [new branch] jcaip/test-cusparselt-version-0.6.2 -> origin/jcaip/test-cusparselt-version-0.6.2 2025-12-04T12:53:07.9701999Z * [new branch] jcaip/update-cusparselt-0.6.2 -> origin/jcaip/update-cusparselt-0.6.2 2025-12-04T12:53:07.9702109Z * [new branch] jiannanWang/memorysnapshot_filter -> origin/jiannanWang/memorysnapshot_filter 2025-12-04T12:53:07.9702220Z * [new branch] jiannanWang/profilerstepwarning -> origin/jiannanWang/profilerstepwarning 2025-12-04T12:53:07.9702305Z * [new branch] jithunnair-amd-patch-1 -> origin/jithunnair-amd-patch-1 2025-12-04T12:53:07.9702387Z * [new branch] jithunnair-amd-patch-10 -> origin/jithunnair-amd-patch-10 2025-12-04T12:53:07.9702470Z * [new branch] jithunnair-amd-patch-2 -> origin/jithunnair-amd-patch-2 2025-12-04T12:53:07.9702589Z * [new branch] jithunnair-amd-patch-3 -> origin/jithunnair-amd-patch-3 2025-12-04T12:53:07.9702670Z * [new branch] jithunnair-amd-patch-4 -> origin/jithunnair-amd-patch-4 2025-12-04T12:53:07.9702751Z * [new branch] jithunnair-amd-patch-5 -> origin/jithunnair-amd-patch-5 2025-12-04T12:53:07.9702830Z * [new branch] jithunnair-amd-patch-6 -> origin/jithunnair-amd-patch-6 2025-12-04T12:53:07.9702908Z * [new branch] jithunnair-amd-patch-7 -> origin/jithunnair-amd-patch-7 2025-12-04T12:53:07.9702987Z * [new branch] jithunnair-amd-patch-8 -> origin/jithunnair-amd-patch-8 2025-12-04T12:53:07.9703065Z * [new branch] jithunnair-amd-patch-9 -> origin/jithunnair-amd-patch-9 2025-12-04T12:53:07.9703143Z * [new branch] justinchu/native-qdq -> origin/justinchu/native-qdq 2025-12-04T12:53:07.9703214Z * [new branch] kainan666/xlf_debug -> origin/kainan666/xlf_debug 2025-12-04T12:53:07.9703279Z * [new branch] kainan_test -> origin/kainan_test 2025-12-04T12:53:07.9703356Z * [new branch] larryliu0820-patch-1 -> origin/larryliu0820-patch-1 2025-12-04T12:53:07.9703462Z * [new branch] leslie/test_group_gemm_epilogues -> origin/leslie/test_group_gemm_epilogues 2025-12-04T12:53:07.9703563Z * [new branch] lessw2020/fix_cutlass_cache_error -> origin/lessw2020/fix_cutlass_cache_error 2025-12-04T12:53:07.9703640Z * [new branch] liaoxuan/shm_all_reduce -> origin/liaoxuan/shm_all_reduce 2025-12-04T12:53:07.9703742Z * [new branch] liaoxuan/test_fa_disable_softmax -> origin/liaoxuan/test_fa_disable_softmax 2025-12-04T12:53:07.9703819Z * [new branch] liaoxuan/test_int8_sdpa -> origin/liaoxuan/test_int8_sdpa 2025-12-04T12:53:07.9703888Z * [new branch] llama4-stable -> origin/llama4-stable 2025-12-04T12:53:07.9703955Z * [new branch] lts/release/1.8 -> origin/lts/release/1.8 2025-12-04T12:53:07.9704028Z * [new branch] lucaskabela/#94773 -> origin/lucaskabela/#94773 2025-12-04T12:53:07.9704105Z * [new branch] lucaskabela/fix_164876 -> origin/lucaskabela/fix_164876 2025-12-04T12:53:07.9704185Z * [new branch] lucaskabela/flop_counter -> origin/lucaskabela/flop_counter 2025-12-04T12:53:07.9704278Z * [new branch] lucaskabela/func_under_decomp -> origin/lucaskabela/func_under_decomp 2025-12-04T12:53:07.9704384Z * [new branch] lucaskabela/functional_in_dynamo -> origin/lucaskabela/functional_in_dynamo 2025-12-04T12:53:07.9704533Z * [new branch] lucaskabela/install_params_as_graph_attr -> origin/lucaskabela/install_params_as_graph_attr 2025-12-04T12:53:07.9704647Z * [new branch] 
lucaskabela/parameters_as_graph_attr -> origin/lucaskabela/parameters_as_graph_attr 2025-12-04T12:53:07.9704778Z * [new branch] lucaskabela/remove_aot_dispatcher_metadata -> origin/lucaskabela/remove_aot_dispatcher_metadata 2025-12-04T12:53:07.9704856Z * [new branch] lucaskabela/rnn_decomp -> origin/lucaskabela/rnn_decomp 2025-12-04T12:53:07.9704948Z * [new branch] lucaskabela/typing_backends -> origin/lucaskabela/typing_backends 2025-12-04T12:53:07.9705045Z * [new branch] lucaskabela/typing_ctx_manager -> origin/lucaskabela/typing_ctx_manager 2025-12-04T12:53:07.9705139Z * [new branch] lucaskabela/typing_nn_module -> origin/lucaskabela/typing_nn_module 2025-12-04T12:53:07.9705242Z * [new branch] lucaskabela/typing_user_defined -> origin/lucaskabela/typing_user_defined 2025-12-04T12:53:07.9705335Z * [new branch] lucaskabela/typing_variables -> origin/lucaskabela/typing_variables 2025-12-04T12:53:07.9705443Z * [new branch] lucaskabela/typing_variables_dicts -> origin/lucaskabela/typing_variables_dicts 2025-12-04T12:53:07.9705582Z * [new branch] lucaskabela/typing_variables_functions -> origin/lucaskabela/typing_variables_functions 2025-12-04T12:53:07.9705688Z * [new branch] lucaskabela/typing_variables_lists -> origin/lucaskabela/typing_variables_lists 2025-12-04T12:53:07.9705759Z * [new branch] lw/torch_box_by_ref -> origin/lw/torch_box_by_ref 2025-12-04T12:53:07.9705820Z * [new branch] main -> origin/main 2025-12-04T12:53:07.9705889Z * [new branch] malfet-patch-1 -> origin/malfet-patch-1 2025-12-04T12:53:07.9705960Z * [new branch] malfet-patch-2 -> origin/malfet-patch-2 2025-12-04T12:53:07.9706027Z * [new branch] malfet-patch-3 -> origin/malfet-patch-3 2025-12-04T12:53:07.9706093Z * [new branch] malfet-patch-4 -> origin/malfet-patch-4 2025-12-04T12:53:07.9706158Z * [new branch] malfet-patch-5 -> origin/malfet-patch-5 2025-12-04T12:53:07.9706223Z * [new branch] malfet-patch-6 -> origin/malfet-patch-6 2025-12-04T12:53:07.9706287Z * [new branch] malfet-patch-7 -> origin/malfet-patch-7 2025-12-04T12:53:07.9706352Z * [new branch] malfet-patch-8 -> origin/malfet-patch-8 2025-12-04T12:53:07.9706424Z * [new branch] malfet/add-3.14-ci -> origin/malfet/add-3.14-ci 2025-12-04T12:53:07.9706582Z * [new branch] malfet/be-do-not-make-typos-in-build-artifacts -> origin/malfet/be-do-not-make-typos-in-build-artifacts 2025-12-04T12:53:07.9706748Z * [new branch] malfet/be-move-more-settings-to-checkout-pytorch -> origin/malfet/be-move-more-settings-to-checkout-pytorch 2025-12-04T12:53:07.9706872Z * [new branch] malfet/be-remove-misisng-neon-headers -> origin/malfet/be-remove-misisng-neon-headers 2025-12-04T12:53:07.9706971Z * [new branch] malfet/mps-implement-col2im -> origin/malfet/mps-implement-col2im 2025-12-04T12:53:07.9707088Z * [new branch] manuel/aoti_metal_shimify-thread_safe -> origin/manuel/aoti_metal_shimify-thread_safe 2025-12-04T12:53:07.9707178Z * [new branch] manuel/inductor_link_openmp -> origin/manuel/inductor_link_openmp 2025-12-04T12:53:07.9707251Z * [new branch] masnesral/metaconda -> origin/masnesral/metaconda 2025-12-04T12:53:07.9707326Z * [new branch] mem_profiler_flaky_fix -> origin/mem_profiler_flaky_fix 2025-12-04T12:53:07.9707405Z * [new branch] mem_profiler_stack_trace -> origin/mem_profiler_stack_trace 2025-12-04T12:53:07.9707510Z * [new branch] memory_profiler_stack -> origin/memory_profiler_stack 2025-12-04T12:53:07.9707583Z * [new branch] metascroy-patch-1 -> origin/metascroy-patch-1 2025-12-04T12:53:07.9707645Z * [new branch] mingw_posix -> origin/mingw_posix 
2025-12-04T12:53:07.9707719Z * [new branch] mlazos/S429861-debug -> origin/mlazos/S429861-debug 2025-12-04T12:53:07.9707780Z * [new branch] mlazos/aa -> origin/mlazos/aa 2025-12-04T12:53:07.9707841Z * [new branch] mlazos/acts -> origin/mlazos/acts 2025-12-04T12:53:07.9707911Z * [new branch] mlazos/arg-renames -> origin/mlazos/arg-renames 2025-12-04T12:53:07.9707987Z * [new branch] mlazos/bad-cudagraphs -> origin/mlazos/bad-cudagraphs 2025-12-04T12:53:07.9708084Z * [new branch] mlazos/baseline-graph-breaks -> origin/mlazos/baseline-graph-breaks 2025-12-04T12:53:07.9708158Z * [new branch] mlazos/beta-tensor -> origin/mlazos/beta-tensor 2025-12-04T12:53:07.9708222Z * [new branch] mlazos/buffers -> origin/mlazos/buffers 2025-12-04T12:53:07.9708288Z * [new branch] mlazos/buffers2 -> origin/mlazos/buffers2 2025-12-04T12:53:07.9708380Z * [new branch] mlazos/buffers3 -> origin/mlazos/buffers3 2025-12-04T12:53:07.9708443Z * [new branch] mlazos/bwd -> origin/mlazos/bwd 2025-12-04T12:53:07.9708513Z * [new branch] mlazos/combo-test -> origin/mlazos/combo-test 2025-12-04T12:53:07.9708585Z * [new branch] mlazos/ctx-cleanup -> origin/mlazos/ctx-cleanup 2025-12-04T12:53:07.9708658Z * [new branch] mlazos/cuda-cmd-log -> origin/mlazos/cuda-cmd-log 2025-12-04T12:53:07.9708739Z * [new branch] mlazos/cudagraph-tests -> origin/mlazos/cudagraph-tests 2025-12-04T12:53:07.9708842Z * [new branch] mlazos/cudagraphs-measurement -> origin/mlazos/cudagraphs-measurement 2025-12-04T12:53:07.9708915Z * [new branch] mlazos/cutlass-test -> origin/mlazos/cutlass-test 2025-12-04T12:53:07.9708994Z * [new branch] mlazos/cutlass-topo-bug -> origin/mlazos/cutlass-topo-bug 2025-12-04T12:53:07.9709075Z * [new branch] mlazos/dataclass-proxy -> origin/mlazos/dataclass-proxy 2025-12-04T12:53:07.9709141Z * [new branch] mlazos/dc-attrs -> origin/mlazos/dc-attrs 2025-12-04T12:53:07.9709208Z * [new branch] mlazos/dc-helion -> origin/mlazos/dc-helion 2025-12-04T12:53:07.9709275Z * [new branch] mlazos/dict-fix -> origin/mlazos/dict-fix 2025-12-04T12:53:07.9709344Z * [new branch] mlazos/disable-tf -> origin/mlazos/disable-tf 2025-12-04T12:53:07.9709412Z * [new branch] mlazos/dupe-fix -> origin/mlazos/dupe-fix 2025-12-04T12:53:07.9709481Z * [new branch] mlazos/dyn-batch -> origin/mlazos/dyn-batch 2025-12-04T12:53:07.9709542Z * [new branch] mlazos/evt -> origin/mlazos/evt 2025-12-04T12:53:07.9709621Z * [new branch] mlazos/extract-examples -> origin/mlazos/extract-examples 2025-12-04T12:53:07.9709692Z * [new branch] mlazos/foreach-op -> origin/mlazos/foreach-op 2025-12-04T12:53:07.9709753Z * [new branch] mlazos/fp8 -> origin/mlazos/fp8 2025-12-04T12:53:07.9709819Z * [new branch] mlazos/fp8-bias -> origin/mlazos/fp8-bias 2025-12-04T12:53:07.9709896Z * [new branch] mlazos/fp8-bias-fusion -> origin/mlazos/fp8-bias-fusion 2025-12-04T12:53:07.9709964Z * [new branch] mlazos/fp8-fixes -> origin/mlazos/fp8-fixes 2025-12-04T12:53:07.9710031Z * [new branch] mlazos/freezing -> origin/mlazos/freezing 2025-12-04T12:53:07.9710123Z * [new branch] mlazos/h-comp -> origin/mlazos/h-comp 2025-12-04T12:53:07.9710230Z * [new branch] mlazos/h-comp2 -> origin/mlazos/h-comp2 2025-12-04T12:53:07.9710300Z * [new branch] mlazos/hash-hop -> origin/mlazos/hash-hop 2025-12-04T12:53:07.9710361Z * [new branch] mlazos/hc -> origin/mlazos/hc 2025-12-04T12:53:07.9710430Z * [new branch] mlazos/hc-cycles -> origin/mlazos/hc-cycles 2025-12-04T12:53:07.9710496Z * [new branch] mlazos/hc-fixes -> origin/mlazos/hc-fixes 2025-12-04T12:53:07.9710563Z * [new branch] mlazos/hc-fixes3 -> 
origin/mlazos/hc-fixes3 2025-12-04T12:53:07.9710629Z * [new branch] mlazos/hc-fixes4 -> origin/mlazos/hc-fixes4 2025-12-04T12:53:07.9710693Z * [new branch] mlazos/hc-hf -> origin/mlazos/hc-hf 2025-12-04T12:53:07.9710759Z * [new branch] mlazos/hc-mut -> origin/mlazos/hc-mut 2025-12-04T12:53:07.9710821Z * [new branch] mlazos/hc10 -> origin/mlazos/hc10 2025-12-04T12:53:07.9710883Z * [new branch] mlazos/hc11 -> origin/mlazos/hc11 2025-12-04T12:53:07.9710944Z * [new branch] mlazos/hc12 -> origin/mlazos/hc12 2025-12-04T12:53:07.9711041Z * [new branch] mlazos/hc13 -> origin/mlazos/hc13 2025-12-04T12:53:07.9711104Z * [new branch] mlazos/hc14 -> origin/mlazos/hc14 2025-12-04T12:53:07.9711163Z * [new branch] mlazos/hc15 -> origin/mlazos/hc15 2025-12-04T12:53:07.9711224Z * [new branch] mlazos/hc2 -> origin/mlazos/hc2 2025-12-04T12:53:07.9711284Z * [new branch] mlazos/hc4 -> origin/mlazos/hc4 2025-12-04T12:53:07.9711344Z * [new branch] mlazos/hc5 -> origin/mlazos/hc5 2025-12-04T12:53:07.9711405Z * [new branch] mlazos/hc6 -> origin/mlazos/hc6 2025-12-04T12:53:07.9711464Z * [new branch] mlazos/hc7 -> origin/mlazos/hc7 2025-12-04T12:53:07.9711523Z * [new branch] mlazos/hc8 -> origin/mlazos/hc8 2025-12-04T12:53:07.9711582Z * [new branch] mlazos/hc9 -> origin/mlazos/hc9 2025-12-04T12:53:07.9711655Z * [new branch] mlazos/hc_baseline2 -> origin/mlazos/hc_baseline2 2025-12-04T12:53:07.9711736Z * [new branch] mlazos/inductor-streams -> origin/mlazos/inductor-streams 2025-12-04T12:53:07.9711798Z * [new branch] mlazos/main -> origin/mlazos/main 2025-12-04T12:53:07.9711859Z * [new branch] mlazos/mcg2 -> origin/mlazos/mcg2 2025-12-04T12:53:07.9711930Z * [new branch] mlazos/meta-guards -> origin/mlazos/meta-guards 2025-12-04T12:53:07.9712034Z * [new branch] mlazos/mlazos/foreach-map-adam -> origin/mlazos/mlazos/foreach-map-adam 2025-12-04T12:53:07.9712128Z * [new branch] mlazos/mlazos/tf-mode-backup -> origin/mlazos/mlazos/tf-mode-backup 2025-12-04T12:53:07.9712193Z * [new branch] mlazos/mod-fix -> origin/mlazos/mod-fix 2025-12-04T12:53:07.9712260Z * [new branch] mlazos/mode-fix -> origin/mlazos/mode-fix 2025-12-04T12:53:07.9712326Z * [new branch] mlazos/offsets -> origin/mlazos/offsets 2025-12-04T12:53:07.9712399Z * [new branch] mlazos/overguarding -> origin/mlazos/overguarding 2025-12-04T12:53:07.9712473Z * [new branch] mlazos/proxy-ctors -> origin/mlazos/proxy-ctors 2025-12-04T12:53:07.9712540Z * [new branch] mlazos/quant-fix -> origin/mlazos/quant-fix 2025-12-04T12:53:07.9712609Z * [new branch] mlazos/resnet-fix -> origin/mlazos/resnet-fix 2025-12-04T12:53:07.9712733Z * [new branch] mlazos/rm-buf-names -> origin/mlazos/rm-buf-names 2025-12-04T12:53:07.9712799Z * [new branch] mlazos/rm-code -> origin/mlazos/rm-code 2025-12-04T12:53:07.9712864Z * [new branch] mlazos/rm-spam -> origin/mlazos/rm-spam 2025-12-04T12:53:07.9712926Z * [new branch] mlazos/rtp -> origin/mlazos/rtp 2025-12-04T12:53:07.9713004Z * [new branch] mlazos/static-idx-dbg -> origin/mlazos/static-idx-dbg 2025-12-04T12:53:07.9713090Z * [new branch] mlazos/static-inputs-log -> origin/mlazos/static-inputs-log 2025-12-04T12:53:07.9713153Z * [new branch] mlazos/stests -> origin/mlazos/stests 2025-12-04T12:53:07.9713222Z * [new branch] mlazos/stream-ops -> origin/mlazos/stream-ops 2025-12-04T12:53:07.9713288Z * [new branch] mlazos/td-fix2 -> origin/mlazos/td-fix2 2025-12-04T12:53:07.9713364Z * [new branch] mlazos/tensor-hasattr2 -> origin/mlazos/tensor-hasattr2 2025-12-04T12:53:07.9713427Z * [new branch] mlazos/test -> origin/mlazos/test 
2025-12-04T12:53:07.9713492Z * [new branch] mlazos/tf-mode -> origin/mlazos/tf-mode 2025-12-04T12:53:07.9713569Z * [new branch] mlazos/tf-mode-backup2 -> origin/mlazos/tf-mode-backup2 2025-12-04T12:53:07.9713674Z * [new branch] mlazos/tf-mode-reland -> origin/mlazos/tf-mode-reland 2025-12-04T12:53:07.9713751Z * [new branch] mlazos/tf-mode-reland2 -> origin/mlazos/tf-mode-reland2 2025-12-04T12:53:07.9713825Z * [new branch] mlazos/tf-mode-reland3 -> origin/mlazos/tf-mode-reland3 2025-12-04T12:53:07.9713900Z * [new branch] mlazos/triton-no-epi -> origin/mlazos/triton-no-epi 2025-12-04T12:53:07.9713971Z * [new branch] mlazos/tune-proto -> origin/mlazos/tune-proto 2025-12-04T12:53:07.9714043Z * [new branch] mlazos/tuple-fixes -> origin/mlazos/tuple-fixes 2025-12-04T12:53:07.9714117Z * [new branch] mlazos/tuple-fixes2 -> origin/mlazos/tuple-fixes2 2025-12-04T12:53:07.9714193Z * [new branch] mlazos/tuple-handling -> origin/mlazos/tuple-handling 2025-12-04T12:53:07.9714272Z * [new branch] mlazos/user-stream-base -> origin/mlazos/user-stream-base 2025-12-04T12:53:07.9714345Z * [new branch] mlazos/user-streams -> origin/mlazos/user-streams 2025-12-04T12:53:07.9714436Z * [new branch] mlazos/user-streams-backup -> origin/mlazos/user-streams-backup 2025-12-04T12:53:07.9714529Z * [new branch] mlazos/user-streams-backup2 -> origin/mlazos/user-streams-backup2 2025-12-04T12:53:07.9714597Z * [new branch] mlazos/vary-beta -> origin/mlazos/vary-beta 2025-12-04T12:53:07.9714667Z * [new branch] mlazos/vary-beta2 -> origin/mlazos/vary-beta2 2025-12-04T12:53:07.9714741Z * [new branch] mlazos/weird-perf1 -> origin/mlazos/weird-perf1 2025-12-04T12:53:07.9714812Z * [new branch] mm_out_dtype_compile -> origin/mm_out_dtype_compile 2025-12-04T12:53:07.9714875Z * [new branch] module-shim -> origin/module-shim 2025-12-04T12:53:07.9714936Z * [new branch] move_config -> origin/move_config 2025-12-04T12:53:07.9715006Z * [new branch] msaroufim/reduce -> origin/msaroufim/reduce 2025-12-04T12:53:07.9715074Z * [new branch] mtia/basic-cmake -> origin/mtia/basic-cmake 2025-12-04T12:53:07.9715173Z * [new branch] mwizak/fix-triton-block-shape -> origin/mwizak/fix-triton-block-shape 2025-12-04T12:53:07.9715240Z * [new branch] my_varlen_backup -> origin/my_varlen_backup 2025-12-04T12:53:07.9715313Z * [new branch] nativert_num_outputs -> origin/nativert_num_outputs 2025-12-04T12:53:07.9715400Z * [new branch] new-codegen -> origin/new-codegen 2025-12-04T12:53:07.9715464Z * [new branch] newtest-base -> origin/newtest-base 2025-12-04T12:53:07.9715535Z * [new branch] ngimel/addmm_dtype -> origin/ngimel/addmm_dtype 2025-12-04T12:53:07.9715598Z * [new branch] ngimel/div_inv -> origin/ngimel/div_inv 2025-12-04T12:53:07.9715677Z * [new branch] ngimel/error_index_list -> origin/ngimel/error_index_list 2025-12-04T12:53:07.9715746Z * [new branch] ngimel/gather_grid -> origin/ngimel/gather_grid 2025-12-04T12:53:07.9715832Z * [new branch] ngimel/gather_grid_release -> origin/ngimel/gather_grid_release 2025-12-04T12:53:07.9715897Z * [new branch] ngimel/gg_new -> origin/ngimel/gg_new 2025-12-04T12:53:07.9715963Z * [new branch] ngimel/hostalloc -> origin/ngimel/hostalloc 2025-12-04T12:53:07.9716031Z * [new branch] ngimel/storage_id -> origin/ngimel/storage_id 2025-12-04T12:53:07.9716094Z * [new branch] nightly -> origin/nightly 2025-12-04T12:53:07.9716209Z * [new branch] nikitaved/addmm_1_rowcol_lt_path_check -> origin/nikitaved/addmm_1_rowcol_lt_path_check 2025-12-04T12:53:07.9716362Z * [new branch] nikitaved/addmm_epilogue_fusions_2d_bias -> 
origin/nikitaved/addmm_epilogue_fusions_2d_bias 2025-12-04T12:53:07.9716489Z * [new branch] nikitaved/addmm_epilogue_fusions_inductor -> origin/nikitaved/addmm_epilogue_fusions_inductor 2025-12-04T12:53:07.9716610Z * [new branch] nikitaved/addmm_epilogue_fusions_scratch -> origin/nikitaved/addmm_epilogue_fusions_scratch 2025-12-04T12:53:07.9716726Z * [new branch] nikitaved/grad_addmm_epilogue_fusions -> origin/nikitaved/grad_addmm_epilogue_fusions 2025-12-04T12:53:07.9716836Z * [new branch] nikitaved/simpler_can_use_32bit_index -> origin/nikitaved/simpler_can_use_32bit_index 2025-12-04T12:53:07.9716903Z * [new branch] nikitaved/test -> origin/nikitaved/test 2025-12-04T12:53:07.9717027Z * [new branch] nmacchioni-perf-test-async-autotune -> origin/nmacchioni-perf-test-async-autotune 2025-12-04T12:53:07.9717103Z * [new branch] no_distributed_log_spew -> origin/no_distributed_log_spew 2025-12-04T12:53:07.9717167Z * [new branch] nofun-hack -> origin/nofun-hack 2025-12-04T12:53:07.9717228Z * [new branch] norm_bench -> origin/norm_bench 2025-12-04T12:53:07.9717302Z * [new branch] nullplay/fuse_matmul -> origin/nullplay/fuse_matmul 2025-12-04T12:53:07.9717375Z * [new branch] nullplay_fuse_matmul -> origin/nullplay_fuse_matmul 2025-12-04T12:53:07.9717442Z * [new branch] optimizer_test -> origin/optimizer_test 2025-12-04T12:53:07.9717509Z * [new branch] orig/release/1.10 -> origin/orig/release/1.10 2025-12-04T12:53:07.9717578Z * [new branch] orig/release/1.11 -> origin/orig/release/1.11 2025-12-04T12:53:07.9717647Z * [new branch] orig/release/1.12 -> origin/orig/release/1.12 2025-12-04T12:53:07.9717713Z * [new branch] orig/release/1.13 -> origin/orig/release/1.13 2025-12-04T12:53:07.9717779Z * [new branch] orig/release/1.6 -> origin/orig/release/1.6 2025-12-04T12:53:07.9717846Z * [new branch] orig/release/1.7 -> origin/orig/release/1.7 2025-12-04T12:53:07.9717910Z * [new branch] orig/release/1.8 -> origin/orig/release/1.8 2025-12-04T12:53:07.9717974Z * [new branch] orig/release/1.9 -> origin/orig/release/1.9 2025-12-04T12:53:07.9718038Z * [new branch] orig/release/2.0 -> origin/orig/release/2.0 2025-12-04T12:53:07.9718102Z * [new branch] orig/release/2.1 -> origin/orig/release/2.1 2025-12-04T12:53:07.9718199Z * [new branch] orig/release/2.2 -> origin/orig/release/2.2 2025-12-04T12:53:07.9718262Z * [new branch] orig/release/2.3 -> origin/orig/release/2.3 2025-12-04T12:53:07.9718325Z * [new branch] orig/release/2.4 -> origin/orig/release/2.4 2025-12-04T12:53:07.9718390Z * [new branch] orig/release/2.5 -> origin/orig/release/2.5 2025-12-04T12:53:07.9718453Z * [new branch] orig/release/2.6 -> origin/orig/release/2.6 2025-12-04T12:53:07.9718516Z * [new branch] orig/release/2.7 -> origin/orig/release/2.7 2025-12-04T12:53:07.9718580Z * [new branch] orig/release/2.8 -> origin/orig/release/2.8 2025-12-04T12:53:07.9718644Z * [new branch] orig/release/2.9 -> origin/orig/release/2.9 2025-12-04T12:53:07.9718728Z * [new branch] origin/gh/fxdawnn/1/base -> origin/origin/gh/fxdawnn/1/base 2025-12-04T12:53:07.9718812Z * [new branch] origin/gh/fxdawnn/1/orig -> origin/origin/gh/fxdawnn/1/orig 2025-12-04T12:53:07.9718891Z * [new branch] origin/gh/zpcore/14/orig -> origin/origin/gh/zpcore/14/orig 2025-12-04T12:53:07.9718959Z * [new branch] oulgen-patch-1 -> origin/oulgen-patch-1 2025-12-04T12:53:07.9719049Z * [new branch] oulgen-patch-2 -> origin/oulgen-patch-2 2025-12-04T12:53:07.9719114Z * [new branch] oulgen-patch-3 -> origin/oulgen-patch-3 2025-12-04T12:53:07.9719179Z * [new branch] oulgen-patch-4 -> 
origin/oulgen-patch-4 2025-12-04T12:53:07.9719246Z * [new branch] padded-tensor -> origin/padded-tensor 2025-12-04T12:53:07.9719308Z * [new branch] pca2 -> origin/pca2 2025-12-04T12:53:07.9719379Z * [new branch] per_channel_backup -> origin/per_channel_backup 2025-12-04T12:53:07.9719443Z * [new branch] perf_ops -> origin/perf_ops 2025-12-04T12:53:07.9719505Z * [new branch] perf_ops_2_9 -> origin/perf_ops_2_9 2025-12-04T12:53:07.9719576Z * [new branch] pianpwk-patch-1 -> origin/pianpwk-patch-1 2025-12-04T12:53:07.9719662Z * [new branch] pianpwk/__draft_debug_mode -> origin/pianpwk/__draft_debug_mode 2025-12-04T12:53:07.9719771Z * [new branch] pianpwk/_debug_mode_for_triton_draft -> origin/pianpwk/_debug_mode_for_triton_draft 2025-12-04T12:53:07.9719872Z * [new branch] pianpwk/_debug_nn_module_compile -> origin/pianpwk/_debug_nn_module_compile 2025-12-04T12:53:07.9719955Z * [new branch] pianpwk/_draft_triton_11_3 -> origin/pianpwk/_draft_triton_11_3 2025-12-04T12:53:07.9720046Z * [new branch] pianpwk/_manual_bucket_draft -> origin/pianpwk/_manual_bucket_draft 2025-12-04T12:53:07.9720147Z * [new branch] pianpwk/_profile_w_dispatch_keys -> origin/pianpwk/_profile_w_dispatch_keys 2025-12-04T12:53:07.9720282Z * [new branch] pianpwk/_super_draft_debug_mode -> origin/pianpwk/_super_draft_debug_mode 2025-12-04T12:53:07.9720387Z * [new branch] pianpwk/_unbacked_local_shard_size -> origin/pianpwk/_unbacked_local_shard_size 2025-12-04T12:53:07.9720462Z * [new branch] pianpwk/anomaly_tb -> origin/pianpwk/anomaly_tb 2025-12-04T12:53:07.9720542Z * [new branch] pianpwk/auto_fx_annotate -> origin/pianpwk/auto_fx_annotate 2025-12-04T12:53:07.9720653Z * [new branch] pianpwk/backed_size_oblivious_export -> origin/pianpwk/backed_size_oblivious_export 2025-12-04T12:53:07.9720739Z * [new branch] pianpwk/bert_dynamic_perf -> origin/pianpwk/bert_dynamic_perf 2025-12-04T12:53:07.9720836Z * [new branch] pianpwk/debug_fwd_stack_traces -> origin/pianpwk/debug_fwd_stack_traces 2025-12-04T12:53:07.9720920Z * [new branch] pianpwk/debug_hash_tensor -> origin/pianpwk/debug_hash_tensor 2025-12-04T12:53:07.9721053Z * [new branch] pianpwk/debug_mode_annotate -> origin/pianpwk/debug_mode_annotate 2025-12-04T12:53:07.9721141Z * [new branch] pianpwk/debug_mode_defaults -> origin/pianpwk/debug_mode_defaults 2025-12-04T12:53:07.9721223Z * [new branch] pianpwk/debug_mode_hacks -> origin/pianpwk/debug_mode_hacks 2025-12-04T12:53:07.9721329Z * [new branch] pianpwk/debug_mode_opcall_refactor -> origin/pianpwk/debug_mode_opcall_refactor 2025-12-04T12:53:07.9721414Z * [new branch] pianpwk/debug_mode_show_ids -> origin/pianpwk/debug_mode_show_ids 2025-12-04T12:53:07.9721497Z * [new branch] pianpwk/debug_mode_triton -> origin/pianpwk/debug_mode_triton 2025-12-04T12:53:07.9721591Z * [new branch] pianpwk/debug_show_stack_trace -> origin/pianpwk/debug_show_stack_trace 2025-12-04T12:53:07.9721690Z * [new branch] pianpwk/debug_wait_on_collective -> origin/pianpwk/debug_wait_on_collective 2025-12-04T12:53:07.9721789Z * [new branch] pianpwk/debugmode_compile_tf -> origin/pianpwk/debugmode_compile_tf 2025-12-04T12:53:07.9721912Z * [new branch] pianpwk/dispatch_key_debugging_for_debug -> origin/pianpwk/dispatch_key_debugging_for_debug 2025-12-04T12:53:07.9722073Z * [new branch] pianpwk/draft_debug_mode_tfcompile -> origin/pianpwk/draft_debug_mode_tfcompile 2025-12-04T12:53:07.9722168Z * [new branch] pianpwk/draft_multikernel_nn -> origin/pianpwk/draft_multikernel_nn 2025-12-04T12:53:07.9722280Z * [new branch] pianpwk/draft_multikernel_status_10_5 
-> origin/pianpwk/draft_multikernel_status_10_5 2025-12-04T12:53:07.9722371Z * [new branch] pianpwk/dtensor_custom_chunk -> origin/pianpwk/dtensor_custom_chunk 2025-12-04T12:53:07.9722475Z * [new branch] pianpwk/dtensor_unbacked_keypath -> origin/pianpwk/dtensor_unbacked_keypath 2025-12-04T12:53:07.9722554Z * [new branch] pianpwk/event_list_tree -> origin/pianpwk/event_list_tree 2025-12-04T12:53:07.9722635Z * [new branch] pianpwk/false_numel_refs -> origin/pianpwk/false_numel_refs 2025-12-04T12:53:07.9722713Z * [new branch] pianpwk/maybe_guard_rel -> origin/pianpwk/maybe_guard_rel 2025-12-04T12:53:07.9722816Z * [new branch] pianpwk/multikernel_hints_draft -> origin/pianpwk/multikernel_hints_draft 2025-12-04T12:53:07.9722925Z * [new branch] pianpwk/no_size_oblivious_slice_scat -> origin/pianpwk/no_size_oblivious_slice_scat 2025-12-04T12:53:07.9723038Z * [new branch] pianpwk/oblivious_reshape_view_better -> origin/pianpwk/oblivious_reshape_view_better 2025-12-04T12:53:07.9723120Z * [new branch] pianpwk/pre_forward_hook -> origin/pianpwk/pre_forward_hook 2025-12-04T12:53:07.9723226Z * [new branch] pianpwk/skip_python_keys_alternate -> origin/pianpwk/skip_python_keys_alternate 2025-12-04T12:53:07.9723330Z * [new branch] pianpwk/skip_python_keys_in_guards -> origin/pianpwk/skip_python_keys_in_guards 2025-12-04T12:53:07.9723409Z * [new branch] pianpwk/sym_tokens_draft -> origin/pianpwk/sym_tokens_draft 2025-12-04T12:53:07.9723488Z * [new branch] pianpwk/symint_one_hot -> origin/pianpwk/symint_one_hot 2025-12-04T12:53:07.9723601Z * [new branch] pianpwk/test_pointwise_guard_or_false -> origin/pianpwk/test_pointwise_guard_or_false 2025-12-04T12:53:07.9723697Z * [new branch] pianpwk/totally_draft_sym_wrap -> origin/pianpwk/totally_draft_sym_wrap 2025-12-04T12:53:07.9723777Z * [new branch] pianpwk/try_dumb_stuff -> origin/pianpwk/try_dumb_stuff 2025-12-04T12:53:07.9723854Z * [new branch] pianpwk/try_dumb_stuff_2 -> origin/pianpwk/try_dumb_stuff_2 2025-12-04T12:53:07.9723945Z * [new branch] pianpwk/unbacked_dtensor_mm -> origin/pianpwk/unbacked_dtensor_mm 2025-12-04T12:53:07.9724070Z * [new branch] pianpwk/unbacked_tracing_12_2 -> origin/pianpwk/unbacked_tracing_12_2 2025-12-04T12:53:07.9724145Z * [new branch] pianpwk/user_symints -> origin/pianpwk/user_symints 2025-12-04T12:53:07.9724222Z * [new branch] pianpwk/wan21_reshape -> origin/pianpwk/wan21_reshape 2025-12-04T12:53:07.9724316Z * [new branch] piz/fix_partial_backward_1112 -> origin/piz/fix_partial_backward_1112 2025-12-04T12:53:07.9724390Z * [new branch] piz/prop_cache_clean -> origin/piz/prop_cache_clean 2025-12-04T12:53:07.9724461Z * [new branch] pool-separate -> origin/pool-separate 2025-12-04T12:53:07.9724522Z * [new branch] pr-156087 -> origin/pr-156087 2025-12-04T12:53:07.9724581Z * [new branch] pr/131860 -> origin/pr/131860 2025-12-04T12:53:07.9724650Z * [new branch] predispatch_to -> origin/predispatch_to 2025-12-04T12:53:07.9724716Z * [new branch] protect-c17 -> origin/protect-c17 2025-12-04T12:53:07.9724782Z * [new branch] pt-opt-cuda3 -> origin/pt-opt-cuda3 2025-12-04T12:53:07.9724863Z * [new branch] python_compiled_autograd -> origin/python_compiled_autograd 2025-12-04T12:53:07.9725015Z * [new branch] q1l1/fix_device_moved_constant_type_unknown -> origin/q1l1/fix_device_moved_constant_type_unknown 2025-12-04T12:53:07.9725153Z * [new branch] q1l1/fix_wrong_default_type_for_kernel_call_args -> origin/q1l1/fix_wrong_default_type_for_kernel_call_args 2025-12-04T12:53:07.9725233Z * [new branch] qchip/export-D54134695 -> 
origin/qchip/export-D54134695 2025-12-04T12:53:07.9725305Z * [new branch] quote-pytest_cache -> origin/quote-pytest_cache 2025-12-04T12:53:07.9725400Z * [new branch] reland-accgrad-stream-warn -> origin/reland-accgrad-stream-warn 2025-12-04T12:53:07.9725466Z * [new branch] release/1.10 -> origin/release/1.10 2025-12-04T12:53:07.9725528Z * [new branch] release/1.11 -> origin/release/1.11 2025-12-04T12:53:07.9725592Z * [new branch] release/1.12 -> origin/release/1.12 2025-12-04T12:53:07.9725653Z * [new branch] release/1.13 -> origin/release/1.13 2025-12-04T12:53:07.9725715Z * [new branch] release/1.4 -> origin/release/1.4 2025-12-04T12:53:07.9725780Z * [new branch] release/1.4.1 -> origin/release/1.4.1 2025-12-04T12:53:07.9725841Z * [new branch] release/1.5 -> origin/release/1.5 2025-12-04T12:53:07.9725901Z * [new branch] release/1.6 -> origin/release/1.6 2025-12-04T12:53:07.9725962Z * [new branch] release/1.7 -> origin/release/1.7 2025-12-04T12:53:07.9726022Z * [new branch] release/1.8 -> origin/release/1.8 2025-12-04T12:53:07.9726084Z * [new branch] release/1.9 -> origin/release/1.9 2025-12-04T12:53:07.9726146Z * [new branch] release/2.0 -> origin/release/2.0 2025-12-04T12:53:07.9726204Z * [new branch] release/2.1 -> origin/release/2.1 2025-12-04T12:53:07.9726264Z * [new branch] release/2.2 -> origin/release/2.2 2025-12-04T12:53:07.9726324Z * [new branch] release/2.3 -> origin/release/2.3 2025-12-04T12:53:07.9726382Z * [new branch] release/2.4 -> origin/release/2.4 2025-12-04T12:53:07.9726440Z * [new branch] release/2.5 -> origin/release/2.5 2025-12-04T12:53:07.9726500Z * [new branch] release/2.6 -> origin/release/2.6 2025-12-04T12:53:07.9726558Z * [new branch] release/2.7 -> origin/release/2.7 2025-12-04T12:53:07.9726650Z * [new branch] release/2.8 -> origin/release/2.8 2025-12-04T12:53:07.9726711Z * [new branch] release/2.9 -> origin/release/2.9 2025-12-04T12:53:07.9726774Z * [new branch] release_notes -> origin/release_notes 2025-12-04T12:53:07.9726848Z * [new branch] remove_pyinterpreter -> origin/remove_pyinterpreter 2025-12-04T12:53:07.9726972Z * [new branch] replace-pytorch-labs-20250812-195836 -> origin/replace-pytorch-labs-20250812-195836 2025-12-04T12:53:07.9727090Z * [new branch] replace-pytorch-labs-20250812-200248 -> origin/replace-pytorch-labs-20250812-200248 2025-12-04T12:53:07.9727205Z * [new branch] replace-pytorch-labs-20250812-200324 -> origin/replace-pytorch-labs-20250812-200324 2025-12-04T12:53:07.9727323Z * [new branch] replace-pytorch-labs-20250812-204020 -> origin/replace-pytorch-labs-20250812-204020 2025-12-04T12:53:07.9727453Z * [new branch] revert-131069-gh/krzysztofjordan/1/head -> origin/revert-131069-gh/krzysztofjordan/1/head 2025-12-04T12:53:07.9727584Z * [new branch] revert-131469-gh/andrewor14/51/head -> origin/revert-131469-gh/andrewor14/51/head 2025-12-04T12:53:07.9727686Z * [new branch] revert-152361-gh/fadara01/1/head -> origin/revert-152361-gh/fadara01/1/head 2025-12-04T12:53:07.9727811Z * [new branch] revert-156870-gh/skarjala/3/head -> origin/revert-156870-gh/skarjala/3/head 2025-12-04T12:53:07.9727981Z * [new branch] revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ -> origin/revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ 2025-12-04T12:53:07.9741971Z * [new branch] revert-hoo-invoke-subgraph -> origin/revert-hoo-invoke-subgraph 2025-12-04T12:53:07.9742097Z * [new branch] revert_always_build_distributed -> origin/revert_always_build_distributed 2025-12-04T12:53:07.9742172Z * [new branch] rms_norm_patch -> origin/rms_norm_patch 
2025-12-04T12:53:07.9742275Z * [new branch] ruisi/fix_all_to_all_estimation -> origin/ruisi/fix_all_to_all_estimation 2025-12-04T12:53:07.9742361Z * [new branch] ruisi/fix_comm_estimation -> origin/ruisi/fix_comm_estimation 2025-12-04T12:53:07.9742469Z * [new branch] ruisi/fix_dynamic_shape_estimation -> origin/ruisi/fix_dynamic_shape_estimation 2025-12-04T12:53:07.9742572Z * [new branch] ruisi/fix_llama3_autobucketing -> origin/ruisi/fix_llama3_autobucketing 2025-12-04T12:53:07.9742677Z * [new branch] ruisi/fix_manual_bucketing_ep_pass -> origin/ruisi/fix_manual_bucketing_ep_pass 2025-12-04T12:53:07.9742762Z * [new branch] ruisi/manual_bucket_pass -> origin/ruisi/manual_bucket_pass 2025-12-04T12:53:07.9742911Z * [new branch] ryanguo99/cleanup-dynamo-expected-failures -> origin/ryanguo99/cleanup-dynamo-expected-failures 2025-12-04T12:53:07.9743002Z * [new branch] ryanguo99/fix-closure-var -> origin/ryanguo99/fix-closure-var 2025-12-04T12:53:07.9743080Z * [new branch] rzou/faketensor_bench -> origin/rzou/faketensor_bench 2025-12-04T12:53:07.9743145Z * [new branch] rzou/njt -> origin/rzou/njt 2025-12-04T12:53:07.9743221Z * [new branch] rzou/pca -> origin/rzou/pca 2025-12-04T12:53:07.9743304Z * [new branch] rzou/realprop -> origin/rzou/realprop 2025-12-04T12:53:07.9743372Z * [new branch] samplevllm -> origin/samplevllm 2025-12-04T12:53:07.9743543Z * [new branch] sanchitintel/weird_thing_with_test_cpu_select_algorithm -> origin/sanchitintel/weird_thing_with_test_cpu_select_algorithm 2025-12-04T12:53:07.9743639Z * [new branch] sapling-pr-archive-SS-JIA -> origin/sapling-pr-archive-SS-JIA 2025-12-04T12:53:07.9743754Z * [new branch] sapling-pr-archive-tushar00jain -> origin/sapling-pr-archive-tushar00jain 2025-12-04T12:53:07.9743882Z * [new branch] save -> origin/save 2025-12-04T12:53:07.9743949Z * [new branch] scaled_mm -> origin/scaled_mm 2025-12-04T12:53:07.9744015Z * [new branch] scan_attempt -> origin/scan_attempt 2025-12-04T12:53:07.9744080Z * [new branch] sdym/2.5.1 -> origin/sdym/2.5.1 2025-12-04T12:53:07.9744190Z * [new branch] sekyondaMeta-dynamoconfig-fix -> origin/sekyondaMeta-dynamoconfig-fix 2025-12-04T12:53:07.9744269Z * [new branch] shengf/fx-xform-perf -> origin/shengf/fx-xform-perf 2025-12-04T12:53:07.9744349Z * [new branch] shoumikhin-patch-1 -> origin/shoumikhin-patch-1 2025-12-04T12:53:07.9744430Z * [new branch] solve-accuracy-fix -> origin/solve-accuracy-fix 2025-12-04T12:53:07.9744512Z * [new branch] some_rocm_inductor_skips -> origin/some_rocm_inductor_skips 2025-12-04T12:53:07.9744598Z * [new branch] soulitzer/stash-tls-ac -> origin/soulitzer/stash-tls-ac 2025-12-04T12:53:07.9744685Z * [new branch] sparse-mm-bf16-support -> origin/sparse-mm-bf16-support 2025-12-04T12:53:07.9744761Z * [new branch] starterTaskUpdate -> origin/starterTaskUpdate 2025-12-04T12:53:07.9744861Z * [new branch] suo -> origin/suo 2025-12-04T12:53:07.9744930Z * [new branch] sve-poc -> origin/sve-poc 2025-12-04T12:53:07.9744994Z * [new branch] switch-bn -> origin/switch-bn 2025-12-04T12:53:07.9745087Z * [new branch] sy_annotation_in_autograd_hop -> origin/sy_annotation_in_autograd_hop 2025-12-04T12:53:07.9745161Z * [new branch] sy_aot_eager_record -> origin/sy_aot_eager_record 2025-12-04T12:53:07.9745232Z * [new branch] sy_custom_bucketing -> origin/sy_custom_bucketing 2025-12-04T12:53:07.9745304Z * [new branch] sy_debug_mode_test -> origin/sy_debug_mode_test 2025-12-04T12:53:07.9745375Z * [new branch] sy_deserialize -> origin/sy_deserialize 2025-12-04T12:53:07.9745445Z * [new branch] sy_dump_gm_code 
-> origin/sy_dump_gm_code 2025-12-04T12:53:07.9745511Z * [new branch] sy_exp -> origin/sy_exp 2025-12-04T12:53:07.9745587Z * [new branch] sy_export_annotation -> origin/sy_export_annotation 2025-12-04T12:53:07.9745658Z * [new branch] sy_invoke_subgraph -> origin/sy_invoke_subgraph 2025-12-04T12:53:07.9745731Z * [new branch] sy_kernel_bw_name -> origin/sy_kernel_bw_name 2025-12-04T12:53:07.9745795Z * [new branch] sy_multi_arch -> origin/sy_multi_arch 2025-12-04T12:53:07.9745864Z * [new branch] sy_nn_module_stack -> origin/sy_nn_module_stack 2025-12-04T12:53:07.9745943Z * [new branch] sy_original_dtensor -> origin/sy_original_dtensor 2025-12-04T12:53:07.9746011Z * [new branch] sy_profiler_cia -> origin/sy_profiler_cia 2025-12-04T12:53:07.9746075Z * [new branch] symm_mem_sync -> origin/symm_mem_sync 2025-12-04T12:53:07.9746166Z * [new branch] sympy-bottleneck-repro -> origin/sympy-bottleneck-repro 2025-12-04T12:53:07.9746246Z * [new branch] tensordict_integration -> origin/tensordict_integration 2025-12-04T12:53:07.9746329Z * [new branch] test-move-conda-builds -> origin/test-move-conda-builds 2025-12-04T12:53:07.9746395Z * [new branch] test-old -> origin/test-old 2025-12-04T12:53:07.9746463Z * [new branch] test/bmm_heur -> origin/test/bmm_heur 2025-12-04T12:53:07.9746562Z * [new branch] tianren/customOp_autotune_fix -> origin/tianren/customOp_autotune_fix 2025-12-04T12:53:07.9746721Z * [new branch] tianren/customOp_enable_max_autotune -> origin/tianren/customOp_enable_max_autotune 2025-12-04T12:53:07.9746808Z * [new branch] tianren/customOp_fusion -> origin/tianren/customOp_fusion 2025-12-04T12:53:07.9746938Z * [new branch] tianren/customop_collectiveop_benchmark -> origin/tianren/customop_collectiveop_benchmark 2025-12-04T12:53:07.9747082Z * [new branch] tianren/customop_collectiveop_benchmark_fix -> origin/tianren/customop_collectiveop_benchmark_fix 2025-12-04T12:53:07.9747188Z * [new branch] tianren/customop_dynamic_config -> origin/tianren/customop_dynamic_config 2025-12-04T12:53:07.9747285Z * [new branch] tianren/dynamic_range_input -> origin/tianren/dynamic_range_input 2025-12-04T12:53:07.9747388Z * [new branch] tianren/dynamic_range_input_fix -> origin/tianren/dynamic_range_input_fix 2025-12-04T12:53:07.9747496Z * [new branch] tianren/dynamic_range_input_merge -> origin/tianren/dynamic_range_input_merge 2025-12-04T12:53:07.9747601Z * [new branch] tianren/flex_paged_attn_fix_temp -> origin/tianren/flex_paged_attn_fix_temp 2025-12-04T12:53:07.9747687Z * [new branch] tianren/fx_codegen_dump -> origin/tianren/fx_codegen_dump 2025-12-04T12:53:07.9747777Z * [new branch] tianren/symmetric_memory -> origin/tianren/symmetric_memory 2025-12-04T12:53:07.9747889Z * [new branch] tianren/test -> origin/tianren/test 2025-12-04T12:53:07.9747975Z * [new branch] tidy_performance_cyy -> origin/tidy_performance_cyy 2025-12-04T12:53:07.9748061Z * [new branch] tmp -> origin/tmp 2025-12-04T12:53:07.9748132Z * [new branch] torchtitan_ep -> origin/torchtitan_ep 2025-12-04T12:53:07.9748217Z * [new branch] torchtitan_integration -> origin/torchtitan_integration 2025-12-04T12:53:07.9748309Z * [new branch] trace_fsdp_torchtune_lora -> origin/trace_fsdp_torchtune_lora 2025-12-04T12:53:07.9748398Z * [new branch] traceable_fsdp_unit_tests -> origin/traceable_fsdp_unit_tests 2025-12-04T12:53:07.9748470Z * [new branch] tree_loop_vec_base -> origin/tree_loop_vec_base 2025-12-04T12:53:07.9748537Z * [new branch] triton_kernel -> origin/triton_kernel 2025-12-04T12:53:07.9748606Z * [new branch] tt_pkg_1908 -> 
origin/tt_pkg_1908 2025-12-04T12:53:07.9748669Z * [new branch] type_dec -> origin/type_dec 2025-12-04T12:53:07.9748764Z * [new branch] udate-sphinx-dependancies -> origin/udate-sphinx-dependancies 2025-12-04T12:53:07.9748909Z * [new branch] update-audio-commit-hash/17630256502-1803-1 -> origin/update-audio-commit-hash/17630256502-1803-1 2025-12-04T12:53:07.9749047Z * [new branch] update-audio-commit-hash/19087141161-1916-1 -> origin/update-audio-commit-hash/19087141161-1916-1 2025-12-04T12:53:07.9749186Z * [new branch] update-audio-commit-hash/19250643381-1929-1 -> origin/update-audio-commit-hash/19250643381-1929-1 2025-12-04T12:53:07.9749319Z * [new branch] update-audio-commit-hash/19397724337-1935-1 -> origin/update-audio-commit-hash/19397724337-1935-1 2025-12-04T12:53:07.9749452Z * [new branch] update-audio-commit-hash/19555670148-1941-1 -> origin/update-audio-commit-hash/19555670148-1941-1 2025-12-04T12:53:07.9749586Z * [new branch] update-audio-commit-hash/19750627930-1946-1 -> origin/update-audio-commit-hash/19750627930-1946-1 2025-12-04T12:53:07.9749726Z * [new branch] update-triton-commit-hash/13663274526-1487-2 -> origin/update-triton-commit-hash/13663274526-1487-2 2025-12-04T12:53:07.9749878Z * [new branch] update-vision-commit-hash/19087141161-1916-1 -> origin/update-vision-commit-hash/19087141161-1916-1 2025-12-04T12:53:07.9750072Z * [new branch] update-vision-commit-hash/19184897099-1925-1 -> origin/update-vision-commit-hash/19184897099-1925-1 2025-12-04T12:53:07.9750299Z * [new branch] update-vision-commit-hash/19250643381-1929-1 -> origin/update-vision-commit-hash/19250643381-1929-1 2025-12-04T12:53:07.9750477Z * [new branch] update-vision-commit-hash/19381328640-1934-1 -> origin/update-vision-commit-hash/19381328640-1934-1 2025-12-04T12:53:07.9750615Z * [new branch] update-vision-commit-hash/19485237164-1938-1 -> origin/update-vision-commit-hash/19485237164-1938-1 2025-12-04T12:53:07.9750745Z * [new branch] update-vllm-commit-hash/18451675449-1879-1 -> origin/update-vllm-commit-hash/18451675449-1879-1 2025-12-04T12:53:07.9750833Z * [new branch] update-vllm-dockerfile -> origin/update-vllm-dockerfile 2025-12-04T12:53:07.9750959Z * [new branch] update-xla-commit-hash/19224287370-211-1 -> origin/update-xla-commit-hash/19224287370-211-1 2025-12-04T12:53:07.9751087Z * [new branch] update-xla-commit-hash/19422028566-212-1 -> origin/update-xla-commit-hash/19422028566-212-1 2025-12-04T12:53:07.9751210Z * [new branch] update-xla-commit-hash/19626841311-213-1 -> origin/update-xla-commit-hash/19626841311-213-1 2025-12-04T12:53:07.9751386Z * [new branch] update_docs_torch_multinomial_issue#125388 -> origin/update_docs_torch_multinomial_issue#125388 2025-12-04T12:53:07.9751467Z * [new branch] update_operator_readme -> origin/update_operator_readme 2025-12-04T12:53:07.9751560Z * [new branch] update_slow_tests_1722488736 -> origin/update_slow_tests_1722488736 2025-12-04T12:53:07.9751649Z * [new branch] update_slow_tests_1722879173 -> origin/update_slow_tests_1722879173 2025-12-04T12:53:07.9751737Z * [new branch] update_slow_tests_1762155677 -> origin/update_slow_tests_1762155677 2025-12-04T12:53:07.9751828Z * [new branch] update_slow_tests_1763365283 -> origin/update_slow_tests_1763365283 2025-12-04T12:53:07.9751920Z * [new branch] update_submodule_FBGEMM -> origin/update_submodule_FBGEMM 2025-12-04T12:53:07.9772587Z * [new branch] update_submodule_kineto -> origin/update_submodule_kineto 2025-12-04T12:53:07.9772767Z * [new branch] update_submodule_tensorpipe -> 
origin/update_submodule_tensorpipe 2025-12-04T12:53:07.9772883Z * [new branch] upload-tests-for-autorevert -> origin/upload-tests-for-autorevert 2025-12-04T12:53:07.9772960Z * [new branch] v0.1.2 -> origin/v0.1.2 2025-12-04T12:53:07.9773028Z * [new branch] v1.0.1 -> origin/v1.0.1 2025-12-04T12:53:07.9773093Z * [new branch] v1.0.3 -> origin/v1.0.3 2025-12-04T12:53:07.9773165Z * [new branch] v1.1.0 -> origin/v1.1.0 2025-12-04T12:53:07.9773227Z * [new branch] v1.2.0 -> origin/v1.2.0 2025-12-04T12:53:07.9773294Z * [new branch] v1.3.0 -> origin/v1.3.0 2025-12-04T12:53:07.9773360Z * [new branch] v1.3.1 -> origin/v1.3.1 2025-12-04T12:53:07.9773430Z * [new branch] validate_fn -> origin/validate_fn 2025-12-04T12:53:07.9773502Z * [new branch] validations_2.6 -> origin/validations_2.6 2025-12-04T12:53:07.9773579Z * [new branch] validations_2.8 -> origin/validations_2.8 2025-12-04T12:53:07.9773649Z * [new branch] varlen-api -> origin/varlen-api 2025-12-04T12:53:07.9773729Z * [new branch] varlen-api-backup -> origin/varlen-api-backup 2025-12-04T12:53:07.9773815Z * [new branch] varlen_batch_invariance -> origin/varlen_batch_invariance 2025-12-04T12:53:07.9773885Z * [new branch] viable/strict -> origin/viable/strict 2025-12-04T12:53:07.9774072Z * [new branch] vishal9-team/dtensor_parallelism_toy -> origin/vishal9-team/dtensor_parallelism_toy 2025-12-04T12:53:07.9774145Z * [new branch] vllmbuildci -> origin/vllmbuildci 2025-12-04T12:53:07.9774209Z * [new branch] vllmpin -> origin/vllmpin 2025-12-04T12:53:07.9774303Z * [new branch] vscode-recommend-pyrefly -> origin/vscode-recommend-pyrefly 2025-12-04T12:53:07.9774381Z * [new branch] wdvr-patch-1 -> origin/wdvr-patch-1 2025-12-04T12:53:07.9774451Z * [new branch] wdvr/iss_145259 -> origin/wdvr/iss_145259 2025-12-04T12:53:07.9774522Z * [new branch] whc/pei -> origin/whc/pei 2025-12-04T12:53:07.9774591Z * [new branch] whc/pp_fix -> origin/whc/pp_fix 2025-12-04T12:53:07.9774661Z * [new branch] whc/sharding -> origin/whc/sharding 2025-12-04T12:53:07.9774737Z * [new branch] whc/sharding2 -> origin/whc/sharding2 2025-12-04T12:53:07.9774802Z * [new branch] whc/uneven -> origin/whc/uneven 2025-12-04T12:53:07.9774879Z * [new branch] whc/uneven-merge -> origin/whc/uneven-merge 2025-12-04T12:53:07.9774950Z * [new branch] win_warnings -> origin/win_warnings 2025-12-04T12:53:07.9775083Z * [new branch] windows_libtorch_free -> origin/windows_libtorch_free 2025-12-04T12:53:07.9775151Z * [new branch] xmfan-war -> origin/xmfan-war 2025-12-04T12:53:07.9775223Z * [new branch] xmfan/ca_0516 -> origin/xmfan/ca_0516 2025-12-04T12:53:07.9775295Z * [new branch] xmfan/ca_1051b93192 -> origin/xmfan/ca_1051b93192 2025-12-04T12:53:07.9775450Z * [new branch] xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 -> origin/xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 2025-12-04T12:53:07.9775532Z * [new branch] xmfan/ca_5a2be192d1 -> origin/xmfan/ca_5a2be192d1 2025-12-04T12:53:07.9775603Z * [new branch] xmfan/ca_9d59b516e9 -> origin/xmfan/ca_9d59b516e9 2025-12-04T12:53:07.9775672Z * [new branch] xmfan/ca_apr8 -> origin/xmfan/ca_apr8 2025-12-04T12:53:07.9775744Z * [new branch] xmfan/ca_base -> origin/xmfan/ca_base 2025-12-04T12:53:07.9775819Z * [new branch] xmfan/ca_dynamic -> origin/xmfan/ca_dynamic 2025-12-04T12:53:07.9775896Z * [new branch] xmfan/ca_fix_dyn -> origin/xmfan/ca_fix_dyn 2025-12-04T12:53:07.9775985Z * [new branch] xmfan/ca_fix_lowering -> origin/xmfan/ca_fix_lowering 2025-12-04T12:53:07.9776066Z * [new branch] xmfan/ca_fix_polyfills -> origin/xmfan/ca_fix_polyfills 
2025-12-04T12:53:07.9776134Z * [new branch] xmfan/ca_jan3 -> origin/xmfan/ca_jan3 2025-12-04T12:53:07.9776212Z * [new branch] xmfan/ca_jun18 -> origin/xmfan/ca_jun18 2025-12-04T12:53:07.9776281Z * [new branch] xmfan/ca_jun24 -> origin/xmfan/ca_jun24 2025-12-04T12:53:07.9776355Z * [new branch] xmfan/ca_nested -> origin/xmfan/ca_nested 2025-12-04T12:53:07.9776439Z * [new branch] xmfan/ca_overhead -> origin/xmfan/ca_overhead 2025-12-04T12:53:07.9776538Z * [new branch] xmfan/ca_overhead_0eba7e5451 -> origin/xmfan/ca_overhead_0eba7e5451 2025-12-04T12:53:07.9776619Z * [new branch] xmfan/cacu_jun18 -> origin/xmfan/cacu_jun18 2025-12-04T12:53:07.9776693Z * [new branch] xmfan/cacu_jun19 -> origin/xmfan/cacu_jun19 2025-12-04T12:53:07.9776763Z * [new branch] xmfan/cacu_jun4 -> origin/xmfan/cacu_jun4 2025-12-04T12:53:07.9776854Z * [new branch] xmfan/disable_duck_shape -> origin/xmfan/disable_duck_shape 2025-12-04T12:53:07.9776994Z * [new branch] xmfan/fca_cpp_node_passthrough -> origin/xmfan/fca_cpp_node_passthrough 2025-12-04T12:53:07.9777151Z * [new branch] xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 2025-12-04T12:53:07.9777307Z * [new branch] xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 2025-12-04T12:53:07.9777383Z * [new branch] xmfan/single_step -> origin/xmfan/single_step 2025-12-04T12:53:07.9777451Z * [new branch] xmfan/sth_0829 -> origin/xmfan/sth_0829 2025-12-04T12:53:07.9777521Z * [new branch] xmfan/test -> origin/xmfan/test 2025-12-04T12:53:07.9777612Z * [new branch] yguo/debug-0226-constexpr -> origin/yguo/debug-0226-constexpr 2025-12-04T12:53:07.9777693Z * [new branch] yguo/new_latest_changes -> origin/yguo/new_latest_changes 2025-12-04T12:53:07.9777799Z * [new branch] yguo/patch_constexpr_changes -> origin/yguo/patch_constexpr_changes 2025-12-04T12:53:07.9777870Z * [new branch] yiming/bootcamp -> origin/yiming/bootcamp 2025-12-04T12:53:07.9777975Z * [new branch] yiming/run_with_start_end_rng_hop -> origin/yiming/run_with_start_end_rng_hop 2025-12-04T12:53:07.9778077Z * [new branch] yolo-llama3 -> origin/yolo-llama3 2025-12-04T12:53:07.9778151Z * [new branch] zainr/canary-test -> origin/zainr/canary-test 2025-12-04T12:53:07.9778247Z * [new branch] zainr/cleanup-gh-runners -> origin/zainr/cleanup-gh-runners 2025-12-04T12:53:07.9778330Z * [new branch] zainr/pull-migration-c -> origin/zainr/pull-migration-c 2025-12-04T12:53:07.9778397Z * [new branch] zainr/test2 -> origin/zainr/test2 2025-12-04T12:53:07.9778479Z * [new branch] zasdfgbnm-patch-3 -> origin/zasdfgbnm-patch-3 2025-12-04T12:53:07.9778544Z * [new branch] zb2p -> origin/zb2p 2025-12-04T12:53:07.9778631Z * [new branch] zeros-and-scatter-part2 -> origin/zeros-and-scatter-part2 2025-12-04T12:53:07.9778732Z * [new branch] zhxchen17/ci/vllm_lora_oom -> origin/zhxchen17/ci/vllm_lora_oom 2025-12-04T12:53:07.9778843Z * [new branch] zhxchen17/ci/vllm_multimodal_oom -> origin/zhxchen17/ci/vllm_multimodal_oom 2025-12-04T12:53:07.9778923Z * [new branch] zhxchen17/ci/vllm_pin -> origin/zhxchen17/ci/vllm_pin 2025-12-04T12:53:07.9779054Z * [new branch] zhxchen17/dynamo/unsafe_drop_all_guards -> origin/zhxchen17/dynamo/unsafe_drop_all_guards 2025-12-04T12:53:07.9779154Z * [new branch] zhxchen17/export/call_override -> origin/zhxchen17/export/call_override 2025-12-04T12:53:07.9779244Z * [new branch] zhxchen17/export/codemod1 -> origin/zhxchen17/export/codemod1 2025-12-04T12:53:07.9779344Z * [new branch] 
zhxchen17/export/ctx_return -> origin/zhxchen17/export/ctx_return 2025-12-04T12:53:07.9779475Z * [new branch] zhxchen17/export/disable_side_effect_warn -> origin/zhxchen17/export/disable_side_effect_warn 2025-12-04T12:53:07.9779578Z * [new branch] zhxchen17/export/pytree_check -> origin/zhxchen17/export/pytree_check 2025-12-04T12:53:07.9779677Z * [new branch] zhxchen17/precompile/aoti -> origin/zhxchen17/precompile/aoti 2025-12-04T12:53:07.9779776Z * [new branch] zhxchen17/precompile/globals -> origin/zhxchen17/precompile/globals 2025-12-04T12:53:07.9779900Z * [new branch] zhxchen17/precompile/inductor_guards -> origin/zhxchen17/precompile/inductor_guards 2025-12-04T12:53:07.9779980Z * [new branch] zhxchen17/scratch/0 -> origin/zhxchen17/scratch/0 2025-12-04T12:53:07.9780090Z * [new branch] zhxchen17/torch_export_api_update -> origin/zhxchen17/torch_export_api_update 2025-12-04T12:53:07.9780253Z * [new branch] zhxhcen17/moodycamel -> origin/zhxhcen17/moodycamel 2025-12-04T12:53:07.9780338Z * [new branch] zxiiro/build-times -> origin/zxiiro/build-times 2025-12-04T12:53:07.9780415Z * [new branch] zxiiro/c7i.2xlarge -> origin/zxiiro/c7i.2xlarge 2025-12-04T12:53:07.9780510Z * [new branch] zxiiro/c7i.2xlarge.h100 -> origin/zxiiro/c7i.2xlarge.h100 2025-12-04T12:53:07.9780581Z * [new branch] zxiiro/main -> origin/zxiiro/main 2025-12-04T12:53:07.9780651Z * [new branch] zxiiro/risc64 -> origin/zxiiro/risc64 2025-12-04T12:53:07.9780751Z * [new branch] zxiiro/test-multicloud-arc -> origin/zxiiro/test-multicloud-arc 2025-12-04T12:53:07.9780819Z t [tag update] ciflow/b200/115316 -> ciflow/b200/115316 2025-12-04T12:53:07.9780883Z * [new tag] ciflow/dynamo/169437 -> ciflow/dynamo/169437 2025-12-04T12:53:07.9780956Z t [tag update] ciflow/h100/115316 -> ciflow/h100/115316 2025-12-04T12:53:07.9781119Z * [new tag] ciflow/inductor-perf-test-nightly-rocm-mi300/169566 -> ciflow/inductor-perf-test-nightly-rocm-mi300/169566 2025-12-04T12:53:07.9781190Z t [tag update] ciflow/inductor/169437 -> ciflow/inductor/169437 2025-12-04T12:53:07.9781299Z * [new tag] ciflow/inductor/169564 -> ciflow/inductor/169564 2025-12-04T12:53:07.9781364Z * [new tag] ciflow/inductor/169566 -> ciflow/inductor/169566 2025-12-04T12:53:07.9781429Z t [tag update] ciflow/rocm/115316 -> ciflow/rocm/115316 2025-12-04T12:53:07.9781496Z * [new tag] ciflow/rocm/169564 -> ciflow/rocm/169564 2025-12-04T12:53:07.9781554Z * [new tag] ciflow/rocm/169566 -> ciflow/rocm/169566 2025-12-04T12:53:07.9781627Z t [tag update] ciflow/trunk/169385 -> ciflow/trunk/169385 2025-12-04T12:53:07.9781694Z t [tag update] ciflow/trunk/169437 -> ciflow/trunk/169437 2025-12-04T12:53:07.9781840Z * [new tag] trunk/a2b5dfb956aed182f6aefce1ff2eda70c35049e1 -> trunk/a2b5dfb956aed182f6aefce1ff2eda70c35049e1 2025-12-04T12:53:08.1765707Z [command]/usr/bin/git rev-parse --verify --quiet ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32^{object} 2025-12-04T12:53:08.1959546Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:53:08.1964214Z ##[endgroup] 2025-12-04T12:53:08.1964459Z ##[group]Determining the checkout info 2025-12-04T12:53:08.1965099Z ##[endgroup] 2025-12-04T12:53:08.1969999Z [command]/usr/bin/git sparse-checkout disable 2025-12-04T12:53:08.2062173Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-12-04T12:53:08.2088770Z ##[group]Checking out the ref 2025-12-04T12:53:08.2090630Z [command]/usr/bin/git checkout --progress --force ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:53:08.3010784Z Previous HEAD position was 685ba6bc0117 add back 
legalize_graph for BC reason (#169541) 2025-12-04T12:53:08.3015884Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T12:53:08.3124090Z ##[endgroup] 2025-12-04T12:53:08.3124603Z ##[group]Setting up auth for fetching submodules 2025-12-04T12:53:08.3129268Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T12:53:08.3159791Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-12-04T12:53:08.3175232Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-12-04T12:53:08.3190125Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-12-04T12:53:08.3212291Z ##[endgroup] 2025-12-04T12:53:08.3212516Z ##[group]Fetching submodules 2025-12-04T12:53:08.3213994Z [command]/usr/bin/git submodule sync --recursive 2025-12-04T12:53:08.3431026Z Synchronizing submodule url for 'android/libs/fbjni' 2025-12-04T12:53:08.3448245Z Synchronizing submodule url for 'third_party/FP16' 2025-12-04T12:53:08.3461367Z Synchronizing submodule url for 'third_party/FXdiv' 2025-12-04T12:53:08.3474240Z Synchronizing submodule url for 'third_party/NNPACK' 2025-12-04T12:53:08.3485674Z Synchronizing submodule url for 'third_party/NVTX' 2025-12-04T12:53:08.3497523Z Synchronizing submodule url for 'third_party/VulkanMemoryAllocator' 2025-12-04T12:53:08.3509526Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-12-04T12:53:08.3528421Z Synchronizing submodule url for 'third_party/aiter' 2025-12-04T12:53:08.3541561Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:53:08.3561751Z Synchronizing submodule url for 'third_party/benchmark' 2025-12-04T12:53:08.3577909Z Synchronizing submodule url for 'third_party/composable_kernel' 2025-12-04T12:53:08.3591997Z Synchronizing submodule url for 'third_party/cpp-httplib' 2025-12-04T12:53:08.3603658Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-12-04T12:53:08.3621013Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-12-04T12:53:08.3632314Z Synchronizing submodule url for 'third_party/cutlass' 2025-12-04T12:53:08.3646030Z Synchronizing submodule url for 'third_party/fbgemm' 2025-12-04T12:53:08.3656506Z Synchronizing submodule url for 'third_party/fbgemm/external/asmjit' 2025-12-04T12:53:08.3668249Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:53:08.3690052Z Synchronizing submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:53:08.3710373Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-12-04T12:53:08.3740349Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-12-04T12:53:08.3754841Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:53:08.3767717Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-12-04T12:53:08.3782802Z Synchronizing submodule url for 'third_party/flash-attention' 2025-12-04T12:53:08.3804573Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:53:08.3818131Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:53:08.3834527Z Synchronizing submodule url for 'third_party/flatbuffers' 2025-12-04T12:53:08.3849486Z Synchronizing submodule url for 'third_party/fmt' 2025-12-04T12:53:08.3867093Z Synchronizing 
submodule url for 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:53:08.3878862Z Synchronizing submodule url for 'third_party/gloo' 2025-12-04T12:53:08.3890420Z Synchronizing submodule url for 'third_party/googletest' 2025-12-04T12:53:08.3901340Z Synchronizing submodule url for 'third_party/ideep' 2025-12-04T12:53:08.3916761Z Synchronizing submodule url for 'third_party/ideep/mkl-dnn' 2025-12-04T12:53:08.3931362Z Synchronizing submodule url for 'third_party/ittapi' 2025-12-04T12:53:08.3942560Z Synchronizing submodule url for 'third_party/kineto' 2025-12-04T12:53:08.3952756Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:53:08.3963969Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:53:08.3976224Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:53:08.3987219Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:53:08.3998330Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:53:08.4013132Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:53:08.4030107Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:53:08.4041049Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:53:08.4052734Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:53:08.4062480Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:53:08.4071758Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:53:08.4081470Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:08.4093320Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:08.4109825Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:53:08.4120078Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:53:08.4132843Z Synchronizing submodule url for 'third_party/kleidiai' 2025-12-04T12:53:08.4143185Z Synchronizing submodule url for 'third_party/mimalloc' 2025-12-04T12:53:08.4159355Z Synchronizing submodule url for 'third_party/nlohmann' 2025-12-04T12:53:08.4170300Z Synchronizing submodule url for 'third_party/onnx' 2025-12-04T12:53:08.4189068Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-12-04T12:53:08.4201822Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-12-04T12:53:08.4213884Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:53:08.4230932Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:53:08.4242061Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:53:08.4254252Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:53:08.4265411Z Synchronizing submodule 
url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:53:08.4277807Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:53:08.4288050Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:53:08.4300449Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:08.4310997Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:08.4323161Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:53:08.4348576Z Synchronizing submodule url for 'third_party/pocketfft' 2025-12-04T12:53:08.4361188Z Synchronizing submodule url for 'third_party/protobuf' 2025-12-04T12:53:08.4373636Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:53:08.4383575Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 2025-12-04T12:53:08.4394932Z Synchronizing submodule url for 'third_party/psimd' 2025-12-04T12:53:08.4405953Z Synchronizing submodule url for 'third_party/pthreadpool' 2025-12-04T12:53:08.4415380Z Synchronizing submodule url for 'third_party/pybind11' 2025-12-04T12:53:08.4424987Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-12-04T12:53:08.4435881Z Synchronizing submodule url for 'third_party/sleef' 2025-12-04T12:53:08.4446636Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-12-04T12:53:08.4458451Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:08.4468333Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:08.4478312Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:08.4491039Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:08.4506260Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:08.4532545Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-12-04T12:53:08.4771885Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-12-04T12:53:08.4843168Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-12-04T12:53:08.4890428Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-12-04T12:53:08.5009346Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-12-04T12:53:08.5071714Z Submodule path 'third_party/NVTX': checked out '3ebbc93ded7285963bff932c678fa367eb393ba6' 2025-12-04T12:53:08.5122491Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-12-04T12:53:08.9928731Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-12-04T12:53:09.0104958Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-12-04T12:53:09.0322151Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-12-04T12:53:09.0444987Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-12-04T12:53:09.0644688Z Submodule path 
'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T12:53:09.0728053Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-12-04T12:53:09.1364262Z Submodule path 'third_party/cpuinfo': checked out 'f858c30bcb16f8effd5ff46996f0514539e17abc' 2025-12-04T12:53:09.1449061Z Submodule path 'third_party/cudnn_frontend': checked out '0b1577c8c83401237d601d0d0db5210506705396' 2025-12-04T12:53:09.1592014Z Submodule path 'third_party/cutlass': checked out 'f88806b1e31dfa579842638740216dd41fc6c588' 2025-12-04T12:53:09.2296163Z Submodule path 'third_party/fbgemm': checked out 'c0b988d39a9e47c794d699f29930ed4d7c7e13a4' 2025-12-04T12:53:09.2596163Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-12-04T12:53:09.4376991Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T12:53:09.5022363Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-12-04T12:53:09.7384247Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '98125ce499b0fdf7ffbe0e3052f5b8709f4840f8' 2025-12-04T12:53:09.7617617Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T12:53:09.7709705Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-12-04T12:53:09.8252632Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-12-04T12:53:09.8359476Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-12-04T12:53:09.8551281Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-12-04T12:53:09.8675392Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-12-04T12:53:09.8786718Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-12-04T12:53:09.8940799Z Submodule path 'third_party/fmt': checked out '407c905e45ad75fc29bf0f9bb7c5c2fd3475976f' 2025-12-04T12:53:09.9141166Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-12-04T12:53:09.9271856Z Submodule path 'third_party/gloo': checked out '54cbae0d3a67fa890b4c3d9ee162b7860315e341' 2025-12-04T12:53:09.9456038Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T12:53:09.9532581Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-12-04T12:53:10.2918664Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-12-04T12:53:10.3033001Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-12-04T12:53:10.3139455Z Submodule path 'third_party/kineto': checked out '31f85df8fbd89c188f14ef10f1ec65379786b943' 2025-12-04T12:53:10.3242568Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1' 2025-12-04T12:53:10.3324823Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 
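Each "Submodule path … checked out <sha>" entry above is output from `git -c protocol.version=2 submodule update --init --force --recursive`: every submodule ends up on a detached HEAD at exactly the commit the superproject records, regardless of where the submodule's own branches point. A minimal sketch of reproducing the same pinning locally, assuming network access and enough disk (the gloo probe at the end just confirms one SHA against this log):

# Check out the superproject at the commit this job used, then pin submodules.
git clone https://github.com/pytorch/pytorch && cd pytorch
git checkout ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32
git submodule update --init --force --recursive   # detached HEAD per submodule
git -C third_party/gloo rev-parse HEAD            # 54cbae0d3a67fa890b4c3d9ee162b7860315e341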
2025-12-04T12:53:10.3394354Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-12-04T12:53:10.3474215Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-12-04T12:53:10.3537730Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-12-04T12:53:10.3601623Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-12-04T12:53:10.3671896Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-12-04T12:53:10.3724707Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T12:53:10.3802807Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-12-04T12:53:10.3858642Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-12-04T12:53:10.3927968Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a' 2025-12-04T12:53:10.4007642Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159' 2025-12-04T12:53:10.4071346Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T12:53:10.4132647Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-12-04T12:53:10.4195979Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T12:53:10.4266939Z Submodule path 'third_party/kleidiai': checked out 'd7770c89632329a9914ef1a90289917597639cbe' 2025-12-04T12:53:10.4330240Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-12-04T12:53:10.4415498Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-12-04T12:53:10.6094791Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-12-04T12:53:10.6295353Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-12-04T12:53:10.6410499Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-12-04T12:53:10.6504857Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-12-04T12:53:10.6573203Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-12-04T12:53:10.6636767Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-12-04T12:53:10.6709916Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': 
checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-12-04T12:53:10.6777056Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-12-04T12:53:10.6837636Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-12-04T12:53:10.6918576Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-12-04T12:53:10.7009744Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-12-04T12:53:10.7075457Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T12:53:10.7223055Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-12-04T12:53:10.7294927Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-12-04T12:53:10.8574015Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-12-04T12:53:10.8670150Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-12-04T12:53:10.8893364Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-12-04T12:53:10.8957650Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-12-04T12:53:10.9041452Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-12-04T12:53:10.9217910Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-12-04T12:53:10.9440038Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-12-04T12:53:10.9684094Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-12-04T12:53:10.9790277Z Submodule path 'third_party/tensorpipe': checked out '2b4cd91092d335a697416b2a3cb398283246849d' 2025-12-04T12:53:10.9989744Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-12-04T12:53:11.0084733Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-12-04T12:53:11.0368631Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-12-04T12:53:11.0518934Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-12-04T12:53:11.0586137Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-12-04T12:53:11.0627693Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-12-04T12:53:11.0874852Z Entering 'android/libs/fbjni' 2025-12-04T12:53:11.0908751Z Entering 'third_party/FP16' 2025-12-04T12:53:11.0943886Z Entering 'third_party/FXdiv' 2025-12-04T12:53:11.0980436Z Entering 'third_party/NNPACK' 2025-12-04T12:53:11.1016956Z Entering 'third_party/NVTX' 2025-12-04T12:53:11.1050049Z Entering 'third_party/VulkanMemoryAllocator' 
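The `git submodule foreach --recursive git config --local gc.auto 0` run above (each "Entering …" line is one submodule visited) switches off automatic garbage collection per repository, so no later git invocation in the job stalls on a surprise `git gc`. A minimal sketch of the same sweep plus a spot check, with the probe path chosen purely for illustration:

# Disable auto-gc in the superproject and in every submodule.
git config --local gc.auto 0
git submodule foreach --recursive 'git config --local gc.auto 0'
git -C third_party/gloo config --local gc.auto    # prints 0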
2025-12-04T12:53:11.1079236Z Entering 'third_party/XNNPACK' 2025-12-04T12:53:11.1123160Z Entering 'third_party/aiter' 2025-12-04T12:53:11.1160876Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:53:11.1195526Z Entering 'third_party/benchmark' 2025-12-04T12:53:11.1226589Z Entering 'third_party/composable_kernel' 2025-12-04T12:53:11.1256589Z Entering 'third_party/cpp-httplib' 2025-12-04T12:53:11.1288426Z Entering 'third_party/cpuinfo' 2025-12-04T12:53:11.1313469Z Entering 'third_party/cudnn_frontend' 2025-12-04T12:53:11.1341543Z Entering 'third_party/cutlass' 2025-12-04T12:53:11.1367253Z Entering 'third_party/fbgemm' 2025-12-04T12:53:11.1397909Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T12:53:11.1422539Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:53:11.1459874Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:53:11.1486581Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T12:53:11.1524097Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T12:53:11.1554366Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:53:11.1575926Z Entering 'third_party/fbgemm/external/json' 2025-12-04T12:53:11.1602825Z Entering 'third_party/flash-attention' 2025-12-04T12:53:11.1627505Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:53:11.1654421Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:53:11.1687197Z Entering 'third_party/flatbuffers' 2025-12-04T12:53:11.1712948Z Entering 'third_party/fmt' 2025-12-04T12:53:11.1738446Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:53:11.1763367Z Entering 'third_party/gloo' 2025-12-04T12:53:11.1788155Z Entering 'third_party/googletest' 2025-12-04T12:53:11.1811888Z Entering 'third_party/ideep' 2025-12-04T12:53:11.1836822Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T12:53:11.1860887Z Entering 'third_party/ittapi' 2025-12-04T12:53:11.1888068Z Entering 'third_party/kineto' 2025-12-04T12:53:11.1912616Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:53:11.1932645Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:53:11.1957850Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:53:11.1986095Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:53:11.2008403Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:53:11.2033197Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:53:11.2060741Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:53:11.2081946Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:53:11.2103600Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:53:11.2129301Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:53:11.2148874Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:53:11.2173530Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:11.2200581Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:11.2222005Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:53:11.2245291Z Entering 
'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:53:11.2270449Z Entering 'third_party/kleidiai' 2025-12-04T12:53:11.2294957Z Entering 'third_party/mimalloc' 2025-12-04T12:53:11.2321160Z Entering 'third_party/nlohmann' 2025-12-04T12:53:11.2343630Z Entering 'third_party/onnx' 2025-12-04T12:53:11.2376866Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:53:11.2401764Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:53:11.2432331Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:53:11.2462410Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:53:11.2487699Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:53:11.2514474Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:53:11.2541910Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:53:11.2573353Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:53:11.2595954Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:53:11.2632407Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:11.2654688Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:11.2685492Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:53:11.2716088Z Entering 'third_party/pocketfft' 2025-12-04T12:53:11.2742390Z Entering 'third_party/protobuf' 2025-12-04T12:53:11.2769491Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:53:11.2795906Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:53:11.2819609Z Entering 'third_party/psimd' 2025-12-04T12:53:11.2844156Z Entering 'third_party/pthreadpool' 2025-12-04T12:53:11.2868012Z Entering 'third_party/pybind11' 2025-12-04T12:53:11.2898800Z Entering 'third_party/python-peachpy' 2025-12-04T12:53:11.2925231Z Entering 'third_party/sleef' 2025-12-04T12:53:11.2948597Z Entering 'third_party/tensorpipe' 2025-12-04T12:53:11.2977462Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:11.3001818Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:11.3029083Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:11.3051581Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:11.3082505Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:11.3126740Z ##[endgroup] 2025-12-04T12:53:11.3126982Z ##[group]Persisting credentials for submodules 2025-12-04T12:53:11.3135186Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-12-04T12:53:11.3316685Z Entering 'android/libs/fbjni' 2025-12-04T12:53:11.3343416Z Entering 'third_party/FP16' 2025-12-04T12:53:11.3369467Z Entering 'third_party/FXdiv' 2025-12-04T12:53:11.3393423Z Entering 'third_party/NNPACK' 2025-12-04T12:53:11.3418645Z Entering 'third_party/NVTX' 2025-12-04T12:53:11.3445246Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:53:11.3474713Z Entering 'third_party/XNNPACK' 2025-12-04T12:53:11.3507782Z Entering 'third_party/aiter' 2025-12-04T12:53:11.3536236Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:53:11.3564992Z Entering 'third_party/benchmark' 2025-12-04T12:53:11.3591586Z Entering 
'third_party/composable_kernel' 2025-12-04T12:53:11.3624868Z Entering 'third_party/cpp-httplib' 2025-12-04T12:53:11.3648346Z Entering 'third_party/cpuinfo' 2025-12-04T12:53:11.3680486Z Entering 'third_party/cudnn_frontend' 2025-12-04T12:53:11.3718316Z Entering 'third_party/cutlass' 2025-12-04T12:53:11.3744354Z Entering 'third_party/fbgemm' 2025-12-04T12:53:11.3770398Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T12:53:11.3794918Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:53:11.3825800Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:53:11.3853635Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T12:53:11.3883559Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T12:53:11.3915121Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:53:11.3939429Z Entering 'third_party/fbgemm/external/json' 2025-12-04T12:53:11.3963035Z Entering 'third_party/flash-attention' 2025-12-04T12:53:11.3988279Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:53:11.4020165Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:53:11.4049374Z Entering 'third_party/flatbuffers' 2025-12-04T12:53:11.4077217Z Entering 'third_party/fmt' 2025-12-04T12:53:11.4099563Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:53:11.4123347Z Entering 'third_party/gloo' 2025-12-04T12:53:11.4151111Z Entering 'third_party/googletest' 2025-12-04T12:53:11.4175331Z Entering 'third_party/ideep' 2025-12-04T12:53:11.4205978Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T12:53:11.4233295Z Entering 'third_party/ittapi' 2025-12-04T12:53:11.4257830Z Entering 'third_party/kineto' 2025-12-04T12:53:11.4281174Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:53:11.4310583Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:53:11.4335779Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:53:11.4370359Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:53:11.4397626Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:53:11.4418044Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:53:11.4445002Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:53:11.4473860Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:53:11.4505631Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:53:11.4530515Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:53:11.4553096Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:53:11.4574680Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:11.4599086Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:11.4632424Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:53:11.4654442Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:53:11.4683794Z Entering 'third_party/kleidiai' 2025-12-04T12:53:11.4707051Z Entering 'third_party/mimalloc' 2025-12-04T12:53:11.4735651Z Entering 'third_party/nlohmann' 2025-12-04T12:53:11.4760452Z Entering 'third_party/onnx' 
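The "Persisting credentials for submodules" group begins by sweeping every submodule for leftover `url.https://github.com/.insteadOf` rewrites and unsetting them, since this self-hosted runner reuses its workspace between jobs. The `&& … || :` shape keeps the per-submodule exit code at 0 whether or not the key exists, so `foreach` never aborts. The same idempotent unset, as a sketch:

# Unset the rewrite only where present; the trailing ':' swallows the
# lookup failure on submodules that never had the key.
git submodule foreach --recursive sh -c \
  "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' \
   && git config --local --unset-all 'url.https://github.com/.insteadOf' || :"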
2025-12-04T12:53:11.4791654Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:53:11.4819794Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:53:11.4847076Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:53:11.4871185Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:53:11.4893953Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:53:11.4913727Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:53:11.4936177Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:53:11.4961675Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:53:11.4988717Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:53:11.5010802Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:11.5040555Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:11.5071673Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:53:11.5107581Z Entering 'third_party/pocketfft' 2025-12-04T12:53:11.5133099Z Entering 'third_party/protobuf' 2025-12-04T12:53:11.5158707Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:53:11.5181522Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:53:11.5206592Z Entering 'third_party/psimd' 2025-12-04T12:53:11.5230511Z Entering 'third_party/pthreadpool' 2025-12-04T12:53:11.5260020Z Entering 'third_party/pybind11' 2025-12-04T12:53:11.5280995Z Entering 'third_party/python-peachpy' 2025-12-04T12:53:11.5309890Z Entering 'third_party/sleef' 2025-12-04T12:53:11.5335353Z Entering 'third_party/tensorpipe' 2025-12-04T12:53:11.5364666Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:11.5391071Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:11.5413975Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:11.5435661Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:11.5458925Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:11.5494631Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-12-04T12:53:11.5721015Z Entering 'android/libs/fbjni' 2025-12-04T12:53:11.5766437Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T12:53:11.5778281Z Entering 'third_party/FP16' 2025-12-04T12:53:11.5810053Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T12:53:11.5826267Z Entering 'third_party/FXdiv' 2025-12-04T12:53:11.5856327Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T12:53:11.5871238Z Entering 'third_party/NNPACK' 2025-12-04T12:53:11.5897731Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T12:53:11.5908517Z Entering 'third_party/NVTX' 2025-12-04T12:53:11.5940902Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T12:53:11.5955092Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:53:11.5981568Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T12:53:11.5998200Z Entering 'third_party/XNNPACK' 2025-12-04T12:53:11.6030930Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T12:53:11.6049724Z Entering 'third_party/aiter' 2025-12-04T12:53:11.6071951Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T12:53:11.6086220Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:53:11.6109512Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T12:53:11.6130294Z Entering 'third_party/benchmark' 2025-12-04T12:53:11.6162823Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T12:53:11.6172807Z Entering 'third_party/composable_kernel' 2025-12-04T12:53:11.6199790Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T12:53:11.6214687Z Entering 'third_party/cpp-httplib' 2025-12-04T12:53:11.6242079Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T12:53:11.6252939Z Entering 'third_party/cpuinfo' 2025-12-04T12:53:11.6274466Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T12:53:11.6289112Z Entering 'third_party/cudnn_frontend' 2025-12-04T12:53:11.6312363Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T12:53:11.6323454Z Entering 'third_party/cutlass' 2025-12-04T12:53:11.6346518Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T12:53:11.6359319Z Entering 'third_party/fbgemm' 2025-12-04T12:53:11.6382169Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T12:53:11.6396374Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T12:53:11.6422435Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T12:53:11.6433749Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:53:11.6456157Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T12:53:11.6469052Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:53:11.6495883Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T12:53:11.6510475Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T12:53:11.6533712Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T12:53:11.6553779Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T12:53:11.6577200Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T12:53:11.6589916Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:53:11.6613116Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T12:53:11.6629878Z Entering 'third_party/fbgemm/external/json' 
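The masked `http.https://github.com/.extraheader` value being set above makes every HTTPS request from every submodule carry a basic-auth header; the credential itself is redacted as `***` in the log. A sketch of constructing such a header for a token of your own — the `x-access-token` username convention and GNU `base64 -w0` are assumptions for illustration, not details visible in this log:

# TOKEN stands in for a real credential; never hard-code one.
B64=$(printf 'x-access-token:%s' "$TOKEN" | base64 -w0)
git config --local 'http.https://github.com/.extraheader' "AUTHORIZATION: basic $B64"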
2025-12-04T12:53:11.6653190Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T12:53:11.6669343Z Entering 'third_party/flash-attention' 2025-12-04T12:53:11.6690563Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T12:53:11.6702477Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:53:11.6729734Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T12:53:11.6741771Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:53:11.6763707Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T12:53:11.6783592Z Entering 'third_party/flatbuffers' 2025-12-04T12:53:11.6804647Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T12:53:11.6818613Z Entering 'third_party/fmt' 2025-12-04T12:53:11.6839655Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T12:53:11.6849717Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:53:11.6875102Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T12:53:11.6888851Z Entering 'third_party/gloo' 2025-12-04T12:53:11.6911210Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T12:53:11.6925981Z Entering 'third_party/googletest' 2025-12-04T12:53:11.6947322Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:53:11.6960287Z Entering 'third_party/ideep' 2025-12-04T12:53:11.6979822Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T12:53:11.6988025Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T12:53:11.7017532Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T12:53:11.7033061Z Entering 'third_party/ittapi' 2025-12-04T12:53:11.7053041Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T12:53:11.7063846Z Entering 'third_party/kineto' 2025-12-04T12:53:11.7100763Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T12:53:11.7114410Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:53:11.7144671Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T12:53:11.7162144Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:53:11.7194249Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T12:53:11.7212734Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:53:11.7237983Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T12:53:11.7252864Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:53:11.7276461Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T12:53:11.7293354Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:53:11.7312680Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T12:53:11.7324475Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:53:11.7350389Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T12:53:11.7363373Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:53:11.7388305Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T12:53:11.7399183Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:53:11.7422080Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:53:11.7433057Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:53:11.7456757Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T12:53:11.7473180Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:53:11.7502933Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T12:53:11.7513453Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:53:11.7535396Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T12:53:11.7552493Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:11.7578694Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T12:53:11.7590081Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:11.7612554Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T12:53:11.7626612Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:53:11.7659652Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T12:53:11.7676076Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:53:11.7700604Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T12:53:11.7715277Z Entering 'third_party/kleidiai' 
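The `file:/home/runner/_work/pytorch/pytorch/.git/modules/…/config` origins above show where submodule configuration really lives: under the superproject's `.git/modules/` tree rather than inside each worktree, with one extra `modules/` level per nesting hop, and occasionally under a historical name (`third_party/NNPACK_deps/FP16` backing today's `third_party/FP16`). `--show-origin` is what exposes the backing file. A one-line sketch, probe path illustrative:

git -C third_party/gloo config --local --show-origin --get remote.origin.url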
2025-12-04T12:53:11.7743000Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T12:53:11.7754451Z Entering 'third_party/mimalloc' 2025-12-04T12:53:11.7783967Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T12:53:11.7795363Z Entering 'third_party/nlohmann' 2025-12-04T12:53:11.7815398Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T12:53:11.7832136Z Entering 'third_party/onnx' 2025-12-04T12:53:11.7855774Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T12:53:11.7879434Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:53:11.7904189Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T12:53:11.7916400Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:53:11.7938920Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T12:53:11.7950517Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:53:11.7972968Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T12:53:11.7983585Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:53:11.8011023Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:53:11.8022226Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:53:11.8043913Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T12:53:11.8052333Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:53:11.8081905Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T12:53:11.8097100Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:53:11.8139199Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T12:53:11.8151579Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:53:11.8175980Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T12:53:11.8186516Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:53:11.8212014Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T12:53:11.8224238Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:11.8247365Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T12:53:11.8259160Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:11.8282973Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T12:53:11.8295636Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:53:11.8325423Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T12:53:11.8345930Z Entering 'third_party/pocketfft' 2025-12-04T12:53:11.8368172Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T12:53:11.8381894Z Entering 'third_party/protobuf' 2025-12-04T12:53:11.8408123Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T12:53:11.8425491Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:53:11.8452361Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T12:53:11.8467089Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:53:11.8491848Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:53:11.8504452Z Entering 'third_party/psimd' 2025-12-04T12:53:11.8530559Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T12:53:11.8542689Z Entering 'third_party/pthreadpool' 2025-12-04T12:53:11.8564895Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T12:53:11.8580799Z Entering 'third_party/pybind11' 2025-12-04T12:53:11.8609395Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T12:53:11.8628695Z Entering 'third_party/python-peachpy' 2025-12-04T12:53:11.8657719Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T12:53:11.8670308Z Entering 'third_party/sleef' 2025-12-04T12:53:11.8695274Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T12:53:11.8704966Z Entering 'third_party/tensorpipe' 2025-12-04T12:53:11.8729779Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T12:53:11.8747469Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:11.8777882Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:53:11.8791070Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:11.8812268Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T12:53:11.8825823Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:11.8850343Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T12:53:11.8863355Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:11.8885227Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T12:53:11.8898516Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:11.8923539Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T12:53:11.9672053Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-12-04T12:53:11.9874988Z Entering 'android/libs/fbjni' 2025-12-04T12:53:11.9887354Z Entering 'third_party/FP16' 2025-12-04T12:53:11.9907640Z Entering 'third_party/FXdiv' 2025-12-04T12:53:11.9928735Z Entering 'third_party/NNPACK' 2025-12-04T12:53:11.9952309Z Entering 'third_party/NVTX' 2025-12-04T12:53:11.9982099Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:53:12.0004382Z Entering 'third_party/XNNPACK' 2025-12-04T12:53:12.0035583Z Entering 'third_party/aiter' 2025-12-04T12:53:12.0058216Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:53:12.0094199Z Entering 'third_party/benchmark' 2025-12-04T12:53:12.0115793Z Entering 'third_party/composable_kernel' 2025-12-04T12:53:12.0147851Z Entering 'third_party/cpp-httplib' 2025-12-04T12:53:12.0169980Z Entering 'third_party/cpuinfo' 2025-12-04T12:53:12.0205290Z Entering 'third_party/cudnn_frontend' 2025-12-04T12:53:12.0232221Z Entering 'third_party/cutlass' 2025-12-04T12:53:12.0264362Z Entering 'third_party/fbgemm' 2025-12-04T12:53:12.0291225Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T12:53:12.0313577Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:53:12.0338895Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:53:12.0374701Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T12:53:12.0397996Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T12:53:12.0416884Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:53:12.0436618Z Entering 'third_party/fbgemm/external/json' 2025-12-04T12:53:12.0459803Z Entering 'third_party/flash-attention' 2025-12-04T12:53:12.0480590Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:53:12.0515400Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:53:12.0544024Z Entering 'third_party/flatbuffers' 2025-12-04T12:53:12.0567111Z Entering 'third_party/fmt' 2025-12-04T12:53:12.0600360Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:53:12.0634240Z Entering 'third_party/gloo' 2025-12-04T12:53:12.0666096Z Entering 'third_party/googletest' 2025-12-04T12:53:12.0694542Z Entering 'third_party/ideep' 2025-12-04T12:53:12.0727602Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T12:53:12.0755557Z Entering 'third_party/ittapi' 2025-12-04T12:53:12.0783191Z Entering 'third_party/kineto' 2025-12-04T12:53:12.0813041Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:53:12.0838891Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:53:12.0868826Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:53:12.0895305Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:53:12.0918156Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:53:12.0943440Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:53:12.0965595Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:53:12.0988060Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:53:12.1007433Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:53:12.1033881Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:53:12.1059321Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:53:12.1085686Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:12.1107642Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:12.1132066Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:53:12.1152674Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:53:12.1182992Z Entering 'third_party/kleidiai' 2025-12-04T12:53:12.1203594Z Entering 'third_party/mimalloc' 2025-12-04T12:53:12.1227116Z Entering 'third_party/nlohmann' 2025-12-04T12:53:12.1254692Z Entering 'third_party/onnx' 2025-12-04T12:53:12.1283404Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:53:12.1313280Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:53:12.1347568Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:53:12.1377209Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:53:12.1398230Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:53:12.1420140Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:53:12.1444306Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:53:12.1470660Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:53:12.1500147Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:53:12.1525803Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:12.1549391Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:12.1572778Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:53:12.1602507Z Entering 'third_party/pocketfft' 2025-12-04T12:53:12.1624511Z Entering 'third_party/protobuf' 2025-12-04T12:53:12.1646186Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:53:12.1667962Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:53:12.1692700Z Entering 'third_party/psimd' 2025-12-04T12:53:12.1713562Z Entering 'third_party/pthreadpool' 2025-12-04T12:53:12.1735806Z Entering 'third_party/pybind11' 2025-12-04T12:53:12.1757931Z Entering 'third_party/python-peachpy' 2025-12-04T12:53:12.1779806Z Entering 'third_party/sleef' 2025-12-04T12:53:12.1803276Z Entering 'third_party/tensorpipe' 2025-12-04T12:53:12.1825460Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:12.1843806Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:12.1864919Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:12.1885471Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:12.1908143Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:12.1947408Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-12-04T12:53:12.2108402Z Entering 'android/libs/fbjni' 2025-12-04T12:53:12.2128570Z Entering 'third_party/FP16' 2025-12-04T12:53:12.2148050Z Entering 'third_party/FXdiv' 
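Having cleared stale rewrites, the two `foreach` runs above add fresh `url.https://github.com/.insteadOf` mappings in every submodule for both `git@github.com:` and `org-21003710@github.com:`, so any SSH-style submodule URL is transparently fetched over HTTPS, where the persisted auth header applies. In sketch form, for a single repository:

git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:'
git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:'
# A fetch from git@github.com:owner/repo.git now actually contacts
# https://github.com/owner/repo.git and picks up http.*.extraheader.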
2025-12-04T12:53:12.2171321Z Entering 'third_party/NNPACK' 2025-12-04T12:53:12.2192042Z Entering 'third_party/NVTX' 2025-12-04T12:53:12.2212040Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:53:12.2233421Z Entering 'third_party/XNNPACK' 2025-12-04T12:53:12.2258492Z Entering 'third_party/aiter' 2025-12-04T12:53:12.2278173Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:53:12.2302301Z Entering 'third_party/benchmark' 2025-12-04T12:53:12.2323653Z Entering 'third_party/composable_kernel' 2025-12-04T12:53:12.2350742Z Entering 'third_party/cpp-httplib' 2025-12-04T12:53:12.2377777Z Entering 'third_party/cpuinfo' 2025-12-04T12:53:12.2402995Z Entering 'third_party/cudnn_frontend' 2025-12-04T12:53:12.2428995Z Entering 'third_party/cutlass' 2025-12-04T12:53:12.2453842Z Entering 'third_party/fbgemm' 2025-12-04T12:53:12.2476528Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T12:53:12.2497961Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:53:12.2525363Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:53:12.2546568Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T12:53:12.2568938Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T12:53:12.2587384Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:53:12.2606269Z Entering 'third_party/fbgemm/external/json' 2025-12-04T12:53:12.2628300Z Entering 'third_party/flash-attention' 2025-12-04T12:53:12.2648479Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:53:12.2674349Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:53:12.2701458Z Entering 'third_party/flatbuffers' 2025-12-04T12:53:12.2726391Z Entering 'third_party/fmt' 2025-12-04T12:53:12.2747295Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:53:12.2768686Z Entering 'third_party/gloo' 2025-12-04T12:53:12.2794958Z Entering 'third_party/googletest' 2025-12-04T12:53:12.2815165Z Entering 'third_party/ideep' 2025-12-04T12:53:12.2836482Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T12:53:12.2870906Z Entering 'third_party/ittapi' 2025-12-04T12:53:12.2892126Z Entering 'third_party/kineto' 2025-12-04T12:53:12.2912287Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:53:12.2930865Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:53:12.2966610Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:53:12.2986356Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:53:12.3011053Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:53:12.3031613Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:53:12.3053915Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:53:12.3072270Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:53:12.3095814Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:53:12.3124283Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:53:12.3149544Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:53:12.3169197Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:12.3199844Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:12.3226570Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:53:12.3247495Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:53:12.3275990Z Entering 'third_party/kleidiai' 2025-12-04T12:53:12.3297646Z Entering 'third_party/mimalloc' 2025-12-04T12:53:12.3320345Z Entering 'third_party/nlohmann' 2025-12-04T12:53:12.3341601Z Entering 'third_party/onnx' 2025-12-04T12:53:12.3368457Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:53:12.3396474Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:53:12.3418990Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:53:12.3440377Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:53:12.3462023Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:53:12.3482103Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:53:12.3503479Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:53:12.3523221Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:53:12.3541514Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:53:12.3560973Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:12.3590509Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:12.3612253Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:53:12.3641227Z Entering 'third_party/pocketfft' 2025-12-04T12:53:12.3664771Z Entering 'third_party/protobuf' 2025-12-04T12:53:12.3686884Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:53:12.3710715Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:53:12.3733494Z Entering 'third_party/psimd' 2025-12-04T12:53:12.3754034Z Entering 'third_party/pthreadpool' 2025-12-04T12:53:12.3774541Z Entering 'third_party/pybind11' 2025-12-04T12:53:12.3794913Z Entering 'third_party/python-peachpy' 2025-12-04T12:53:12.3817162Z Entering 'third_party/sleef' 2025-12-04T12:53:12.3838002Z Entering 'third_party/tensorpipe' 2025-12-04T12:53:12.3857091Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:12.3876585Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:12.3895977Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:12.3917002Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:12.3936175Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:12.3972441Z ##[endgroup] 2025-12-04T12:53:12.4158402Z [command]/usr/bin/git log -1 --format=%H 2025-12-04T12:53:12.4258419Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:53:12.4405107Z ##[group]Run actions/checkout@v4 2025-12-04T12:53:12.4405233Z with: 2025-12-04T12:53:12.4405340Z ref: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:53:12.4405472Z fetch-depth: 0 2025-12-04T12:53:12.4405565Z submodules: recursive 2025-12-04T12:53:12.4405670Z show-progress: false 2025-12-04T12:53:12.4405785Z repository: pytorch/pytorch 2025-12-04T12:53:12.4405930Z token: *** 2025-12-04T12:53:12.4406029Z ssh-strict: true 2025-12-04T12:53:12.4406120Z ssh-user: git 2025-12-04T12:53:12.4406218Z persist-credentials: true 2025-12-04T12:53:12.4406324Z clean: true 2025-12-04T12:53:12.4406422Z 
sparse-checkout-cone-mode: true 2025-12-04T12:53:12.4406539Z fetch-tags: false 2025-12-04T12:53:12.4406628Z lfs: false 2025-12-04T12:53:12.4406720Z set-safe-directory: true 2025-12-04T12:53:12.4406824Z env: 2025-12-04T12:53:12.4406907Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:12.4407008Z ##[endgroup] 2025-12-04T12:53:12.4881330Z Syncing repository: pytorch/pytorch 2025-12-04T12:53:12.4881663Z ##[group]Getting Git version info 2025-12-04T12:53:12.4881886Z Working directory is '/home/runner/_work/pytorch/pytorch' 2025-12-04T12:53:12.4896221Z [command]/usr/bin/git version 2025-12-04T12:53:12.4923813Z git version 2.52.0 2025-12-04T12:53:12.4944406Z ##[endgroup] 2025-12-04T12:53:12.4950594Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/bf1b6bcf-ad41-4f94-a4c0-31ab7423eff2/.gitconfig' 2025-12-04T12:53:12.4957062Z Temporarily overriding HOME='/home/runner/_work/_temp/bf1b6bcf-ad41-4f94-a4c0-31ab7423eff2' before making global git config changes 2025-12-04T12:53:12.4957399Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T12:53:12.4959889Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T12:53:12.4982172Z [command]/usr/bin/git config --local --get remote.origin.url 2025-12-04T12:53:12.4996430Z https://github.com/pytorch/pytorch 2025-12-04T12:53:12.5014612Z ##[group]Removing previously created refs, to avoid conflicts 2025-12-04T12:53:12.5018556Z [command]/usr/bin/git rev-parse --symbolic-full-name --verify --quiet HEAD 2025-12-04T12:53:12.5033437Z HEAD 2025-12-04T12:53:12.5070655Z ##[endgroup] 2025-12-04T12:53:12.5073091Z [command]/usr/bin/git submodule status 2025-12-04T12:53:12.5290008Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe) 2025-12-04T12:53:12.5339976Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 third_party/FP16 (4dfe081) 2025-12-04T12:53:12.5396404Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327) 2025-12-04T12:53:12.5473432Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0) 2025-12-04T12:53:12.5506302Z 3ebbc93ded7285963bff932c678fa367eb393ba6 third_party/NVTX (v3.1.0-313-g3ebbc93) 2025-12-04T12:53:12.5578733Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600) 2025-12-04T12:53:12.5912408Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656) 2025-12-04T12:53:12.5946282Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101) 2025-12-04T12:53:12.5969612Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3) 2025-12-04T12:53:12.6034959Z 7fe50dc3da2069d6645d9deb8c017a876472a977 third_party/composable_kernel (rocm-6.4.3-459-g7fe50dc3d) 2025-12-04T12:53:12.6122743Z 89c932f313c6437c38f2982869beacc89c2f2246 third_party/cpp-httplib (v0.26.0) 2025-12-04T12:53:12.6216548Z f858c30bcb16f8effd5ff46996f0514539e17abc third_party/cpuinfo (f858c30) 2025-12-04T12:53:12.6238333Z 0b1577c8c83401237d601d0d0db5210506705396 third_party/cudnn_frontend (v0.5-61-g0b1577c) 2025-12-04T12:53:12.6305828Z f88806b1e31dfa579842638740216dd41fc6c588 third_party/cutlass (v4.3.1) 2025-12-04T12:53:12.6331339Z c0b988d39a9e47c794d699f29930ed4d7c7e13a4 third_party/fbgemm (v1.4.0-rc1-2-gc0b988d39) 2025-12-04T12:53:12.6386395Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4) 2025-12-04T12:53:12.6405682Z a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23) 
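Before this second checkout pass mutates anything global, the action copies the runner's `.gitconfig` into a temporary directory and points HOME at it, so the `--global` write below lands in a throwaway file; it then adds `safe.directory` so git will operate on a worktree whose owner differs from the current user. A sketch of the same isolation, where the `mktemp` approach is an assumption standing in for the action's own temp-directory handling:

# Disposable global config, then mark the workspace safe.
export HOME=$(mktemp -d)
cp /home/runner/.gitconfig "$HOME/.gitconfig"
git config --global --add safe.directory /home/runner/_work/pytorch/pytorch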
2025-12-04T12:53:12.6645598Z 407c905e45ad75fc29bf0f9bb7c5c2fd3475976f third_party/fmt (12.1.0) 2025-12-04T12:53:12.6720453Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17) 2025-12-04T12:53:12.6804060Z 54cbae0d3a67fa890b4c3d9ee162b7860315e341 third_party/gloo (remotes/origin/gh/c-p-i-o/1/base-37-g54cbae0) 2025-12-04T12:53:12.6942492Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108) 2025-12-04T12:53:12.7003359Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1) 2025-12-04T12:53:12.7053570Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5) 2025-12-04T12:53:12.7194428Z 31f85df8fbd89c188f14ef10f1ec65379786b943 third_party/kineto (heads/main) 2025-12-04T12:53:12.7220015Z d7770c89632329a9914ef1a90289917597639cbe third_party/kleidiai (v1.15.0) 2025-12-04T12:53:12.7235680Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4) 2025-12-04T12:53:12.7265586Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0) 2025-12-04T12:53:12.7468921Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0) 2025-12-04T12:53:12.7491825Z a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2) 2025-12-04T12:53:12.7507600Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5) 2025-12-04T12:53:12.7711355Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b) 2025-12-04T12:53:12.7761487Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master) 2025-12-04T12:53:12.7814580Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e) 2025-12-04T12:53:12.7836336Z f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8 third_party/pybind11 (v3.0.1) 2025-12-04T12:53:12.7893594Z f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy (remotes/origin/pre-generated) 2025-12-04T12:53:12.7947784Z 5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8) 2025-12-04T12:53:12.8001237Z 2b4cd91092d335a697416b2a3cb398283246849d third_party/tensorpipe (heads/main) 2025-12-04T12:53:12.8013338Z ##[group]Cleaning the repository 2025-12-04T12:53:12.8019106Z [command]/usr/bin/git clean -ffdx 2025-12-04T12:53:12.8142821Z [command]/usr/bin/git reset --hard HEAD 2025-12-04T12:53:12.9007892Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T12:53:12.9082132Z ##[endgroup] 2025-12-04T12:53:12.9084540Z ##[group]Disabling automatic garbage collection 2025-12-04T12:53:12.9088729Z [command]/usr/bin/git config --local gc.auto 0 2025-12-04T12:53:12.9109543Z ##[endgroup] 2025-12-04T12:53:12.9109768Z ##[group]Setting up auth 2025-12-04T12:53:12.9113938Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T12:53:12.9145305Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T12:53:12.9341471Z Entering 'android/libs/fbjni' 2025-12-04T12:53:12.9368420Z Entering 'third_party/FP16' 2025-12-04T12:53:12.9391166Z Entering 'third_party/FXdiv' 2025-12-04T12:53:12.9415243Z Entering 'third_party/NNPACK' 2025-12-04T12:53:12.9443444Z Entering 'third_party/NVTX' 2025-12-04T12:53:12.9471709Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:53:12.9492560Z Entering 'third_party/XNNPACK' 
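Before touching anything else, the action scrubs the reused checkout, as shown in the log just above: `git clean -ffdx` (the doubled `-f` also removes nested git repositories, `-x` removes ignored files), `git reset --hard HEAD` to discard edits to tracked files, and `gc.auto 0` so a background `git gc` cannot prune objects mid-job. Collected as one hygiene sketch, using the same commands the log records:

```sh
git clean -ffdx              # drop untracked files, ignored files, and nested repos
git reset --hard HEAD        # discard local modifications to tracked files
git config --local gc.auto 0 # disable automatic garbage collection for this job
```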
2025-12-04T12:53:13.0760944Z Entering 'third_party/kleidiai' 2025-12-04T12:53:13.0797174Z Entering 'third_party/mimalloc' 2025-12-04T12:53:13.0825560Z Entering 'third_party/nlohmann' 2025-12-04T12:53:13.0854088Z Entering 'third_party/onnx' 2025-12-04T12:53:13.0883903Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:53:13.0908945Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:53:13.0939636Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:53:13.0975351Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:53:13.1000270Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:53:13.1030010Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:53:13.1056742Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:53:13.1077618Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:53:13.1104481Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:53:13.1127390Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:13.1160421Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:13.1196464Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:53:13.1227934Z Entering 'third_party/pocketfft' 2025-12-04T12:53:13.1250052Z Entering 'third_party/protobuf' 2025-12-04T12:53:13.1275266Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:53:13.1314363Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:53:13.1345908Z Entering 'third_party/psimd' 2025-12-04T12:53:13.1371220Z Entering 'third_party/pthreadpool' 2025-12-04T12:53:13.1400004Z Entering 'third_party/pybind11' 2025-12-04T12:53:13.1420430Z Entering 'third_party/python-peachpy' 2025-12-04T12:53:13.1449357Z Entering 'third_party/sleef' 2025-12-04T12:53:13.1472246Z Entering 'third_party/tensorpipe' 2025-12-04T12:53:13.1505815Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:13.1538174Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:13.1567897Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:13.1592535Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:13.1624963Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:13.1669574Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T12:53:13.1687551Z http.https://github.com/.extraheader 2025-12-04T12:53:13.1697513Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-12-04T12:53:13.1720426Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T12:53:13.1894949Z Entering 'android/libs/fbjni' 2025-12-04T12:53:13.1917172Z http.https://github.com/.extraheader 2025-12-04T12:53:13.1938210Z Entering 'third_party/FP16' 2025-12-04T12:53:13.1952574Z http.https://github.com/.extraheader 2025-12-04T12:53:13.1973996Z Entering 'third_party/FXdiv' 2025-12-04T12:53:13.1988950Z http.https://github.com/.extraheader 2025-12-04T12:53:13.2012122Z Entering 'third_party/NNPACK' 2025-12-04T12:53:13.2026245Z http.https://github.com/.extraheader 
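The `git submodule foreach --recursive` invocations above (for `core.sshCommand` and for `http.https://github.com/.extraheader`) use the same guard idiom: `git config --get-regexp` exits non-zero when the key is absent, so `--unset-all` only runs where the key actually exists, and the trailing `|| :` forces a zero exit status so `foreach` does not abort on submodules that were already clean. Annotated sketch of the pattern:

```sh
git submodule foreach --recursive sh -c "
  git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' \
    && git config --local --unset-all 'http.https://github.com/.extraheader' \
    || :   # swallow the non-zero exit so foreach keeps going
"
```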
http.https://github.com/.extraheader 2025-12-04T12:53:13.4817527Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:13.4834067Z http.https://github.com/.extraheader 2025-12-04T12:53:13.4856840Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:13.4871374Z http.https://github.com/.extraheader 2025-12-04T12:53:13.4889055Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:13.4902379Z http.https://github.com/.extraheader 2025-12-04T12:53:13.4919727Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:13.4934686Z http.https://github.com/.extraheader 2025-12-04T12:53:13.4977487Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.4999228Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T12:53:13.5170851Z Entering 'android/libs/fbjni' 2025-12-04T12:53:13.5181729Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T12:53:13.5192623Z Entering 'third_party/FP16' 2025-12-04T12:53:13.5205326Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T12:53:13.5216023Z Entering 'third_party/FXdiv' 2025-12-04T12:53:13.5226951Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T12:53:13.5236251Z Entering 'third_party/NNPACK' 2025-12-04T12:53:13.5246077Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T12:53:13.5254929Z Entering 'third_party/NVTX' 2025-12-04T12:53:13.5265263Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T12:53:13.5275312Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:53:13.5285673Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T12:53:13.5296703Z Entering 'third_party/XNNPACK' 2025-12-04T12:53:13.5308780Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T12:53:13.5327693Z Entering 'third_party/aiter' 2025-12-04T12:53:13.5340462Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T12:53:13.5353273Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:53:13.5370414Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T12:53:13.5384216Z Entering 'third_party/benchmark' 2025-12-04T12:53:13.5394197Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T12:53:13.5402986Z Entering 'third_party/composable_kernel' 2025-12-04T12:53:13.5412789Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T12:53:13.5426929Z Entering 'third_party/cpp-httplib' 2025-12-04T12:53:13.5437307Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T12:53:13.5446820Z Entering 'third_party/cpuinfo' 2025-12-04T12:53:13.5457281Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T12:53:13.5466232Z Entering 'third_party/cudnn_frontend' 
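The `--show-origin` listing here confirms where each submodule's private config lives: under the superproject's `.git/modules/<name>/config`, where `<name>` is the submodule name from `.gitmodules` and can differ from the worktree path (e.g. `third_party/FP16` is stored under `third_party/NNPACK_deps/FP16`). A sketch, not from the log, for resolving a worktree path to that config file:

```sh
# Hypothetical helper: map a submodule worktree path to its module name,
# then read a value straight from that submodule's own config file.
path=third_party/FP16
name=$(git config --file .gitmodules --get-regexp '^submodule\..*\.path$' \
        | awk -v p="$path" '$2 == p { sub(/^submodule\./, "", $1); sub(/\.path$/, "", $1); print $1 }')
git config --file ".git/modules/$name/config" --get remote.origin.url
```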
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T12:53:13.6880031Z Entering 'third_party/pybind11' 2025-12-04T12:53:13.6890292Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T12:53:13.6899957Z Entering 'third_party/python-peachpy' 2025-12-04T12:53:13.6910434Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T12:53:13.6919708Z Entering 'third_party/sleef' 2025-12-04T12:53:13.6930389Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T12:53:13.6939578Z Entering 'third_party/tensorpipe' 2025-12-04T12:53:13.6951089Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T12:53:13.6960337Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:13.6970229Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:53:13.6979470Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:13.6994068Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T12:53:13.7004943Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:13.7015362Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T12:53:13.7027084Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:13.7040399Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T12:53:13.7049321Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:13.7058837Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T12:53:13.7094091Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.7115011Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.7138982Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.7159287Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.7176637Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.7194751Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.7213271Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.7228015Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.8407824Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.8424365Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.8440254Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.8456092Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.8473996Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.8491105Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.8507845Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:53:13.8526268Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T12:53:13.8551403Z ##[endgroup] 2025-12-04T12:53:13.8551649Z ##[group]Fetching the repository 2025-12-04T12:53:13.8555215Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-12-04T12:53:15.3287894Z [command]/usr/bin/git rev-parse --verify --quiet ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32^{object} 2025-12-04T12:53:15.3421212Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:53:15.3426180Z ##[endgroup] 2025-12-04T12:53:15.3427038Z ##[group]Determining the checkout info 2025-12-04T12:53:15.3428301Z ##[endgroup] 2025-12-04T12:53:15.3434134Z [command]/usr/bin/git sparse-checkout disable 2025-12-04T12:53:15.3530366Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-12-04T12:53:15.3560069Z ##[group]Checking out the ref 2025-12-04T12:53:15.3562372Z [command]/usr/bin/git checkout --progress --force ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:53:15.3856639Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T12:53:15.3863767Z ##[endgroup] 2025-12-04T12:53:15.3864051Z ##[group]Setting up auth for fetching submodules 2025-12-04T12:53:15.3866939Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T12:53:15.3895064Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-12-04T12:53:15.3914021Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-12-04T12:53:15.3941307Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-12-04T12:53:15.3966403Z ##[endgroup] 2025-12-04T12:53:15.3966610Z ##[group]Fetching submodules 
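With the pinned commit checked out (HEAD detached at the exact SHA the workflow resolved), the log above shows the action preparing submodule fetching: the `AUTHORIZATION: basic ***` header moves to the global config so it applies inside every submodule, and two `url.https://github.com/.insteadOf` rules rewrite SSH-style remotes to HTTPS so that header actually covers them. A sketch of the rewrite's effect (the `org-21003710@github.com:` prefix is taken from the log as-is; `google/googletest` is only an illustrative repository):

```sh
git config --global --add url.https://github.com/.insteadOf git@github.com:
# A submodule remote recorded as git@github.com:google/googletest.git is now
# fetched as https://github.com/google/googletest.git, where the
# http.https://github.com/.extraheader AUTHORIZATION header applies.
git ls-remote git@github.com:google/googletest.git HEAD
```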
2025-12-04T12:53:15.3968055Z [command]/usr/bin/git submodule sync --recursive 2025-12-04T12:53:15.4208241Z Synchronizing submodule url for 'android/libs/fbjni' 2025-12-04T12:53:15.4219667Z Synchronizing submodule url for 'third_party/FP16' 2025-12-04T12:53:15.4232526Z Synchronizing submodule url for 'third_party/FXdiv' 2025-12-04T12:53:15.4246229Z Synchronizing submodule url for 'third_party/NNPACK' 2025-12-04T12:53:15.4265373Z Synchronizing submodule url for 'third_party/NVTX' 2025-12-04T12:53:15.4281969Z Synchronizing submodule url for 'third_party/VulkanMemoryAllocator' 2025-12-04T12:53:15.4293639Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-12-04T12:53:15.4313256Z Synchronizing submodule url for 'third_party/aiter' 2025-12-04T12:53:15.4326708Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:53:15.4341558Z Synchronizing submodule url for 'third_party/benchmark' 2025-12-04T12:53:15.4353282Z Synchronizing submodule url for 'third_party/composable_kernel' 2025-12-04T12:53:15.4367180Z Synchronizing submodule url for 'third_party/cpp-httplib' 2025-12-04T12:53:15.4378831Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-12-04T12:53:15.4389937Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-12-04T12:53:15.4401699Z Synchronizing submodule url for 'third_party/cutlass' 2025-12-04T12:53:15.4416421Z Synchronizing submodule url for 'third_party/fbgemm' 2025-12-04T12:53:15.4432595Z Synchronizing submodule url for 'third_party/fbgemm/external/asmjit' 2025-12-04T12:53:15.4443415Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:53:15.4456985Z Synchronizing submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:53:15.4467918Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-12-04T12:53:15.4486668Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-12-04T12:53:15.4496956Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:53:15.4507606Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-12-04T12:53:15.4521831Z Synchronizing submodule url for 'third_party/flash-attention' 2025-12-04T12:53:15.4536018Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:53:15.4547673Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:53:15.4563098Z Synchronizing submodule url for 'third_party/flatbuffers' 2025-12-04T12:53:15.4574469Z Synchronizing submodule url for 'third_party/fmt' 2025-12-04T12:53:15.4584628Z Synchronizing submodule url for 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:53:15.4596012Z Synchronizing submodule url for 'third_party/gloo' 2025-12-04T12:53:15.4610440Z Synchronizing submodule url for 'third_party/googletest' 2025-12-04T12:53:15.4623600Z Synchronizing submodule url for 'third_party/ideep' 2025-12-04T12:53:15.4635509Z Synchronizing submodule url for 'third_party/ideep/mkl-dnn' 2025-12-04T12:53:15.4657720Z Synchronizing submodule url for 'third_party/ittapi' 2025-12-04T12:53:15.4669131Z Synchronizing submodule url for 'third_party/kineto' 2025-12-04T12:53:15.4689261Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:53:15.4707745Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:53:15.4721239Z Synchronizing submodule url for 
'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:53:15.4732954Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:53:15.4745045Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:53:15.4758817Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:53:15.4770733Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:53:15.4780915Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:53:15.4793169Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:53:15.4804726Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:53:15.4821343Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:53:15.4842403Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:15.4854104Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:15.4878038Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:53:15.4891429Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:53:15.4906348Z Synchronizing submodule url for 'third_party/kleidiai' 2025-12-04T12:53:15.4917840Z Synchronizing submodule url for 'third_party/mimalloc' 2025-12-04T12:53:15.4928082Z Synchronizing submodule url for 'third_party/nlohmann' 2025-12-04T12:53:15.4941780Z Synchronizing submodule url for 'third_party/onnx' 2025-12-04T12:53:15.4973048Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-12-04T12:53:15.4992690Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-12-04T12:53:15.5005083Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:53:15.5021509Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:53:15.5030738Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:53:15.5047132Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:53:15.5058381Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:53:15.5068981Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:53:15.5083498Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:53:15.5095423Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:15.5110580Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:15.5130037Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:53:15.5150444Z Synchronizing submodule url for 'third_party/pocketfft' 2025-12-04T12:53:15.5161702Z Synchronizing submodule url for 
'third_party/protobuf' 2025-12-04T12:53:15.5173256Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:53:15.5186543Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 2025-12-04T12:53:15.5202077Z Synchronizing submodule url for 'third_party/psimd' 2025-12-04T12:53:15.5214592Z Synchronizing submodule url for 'third_party/pthreadpool' 2025-12-04T12:53:15.5227444Z Synchronizing submodule url for 'third_party/pybind11' 2025-12-04T12:53:15.5241283Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-12-04T12:53:15.5254731Z Synchronizing submodule url for 'third_party/sleef' 2025-12-04T12:53:15.5267078Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-12-04T12:53:15.5279855Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:15.5290570Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:15.5301419Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:15.5312141Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:15.5322815Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:15.5356246Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-12-04T12:53:15.5606278Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-12-04T12:53:15.5676254Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-12-04T12:53:15.5748311Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-12-04T12:53:15.5815459Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-12-04T12:53:15.5888897Z Submodule path 'third_party/NVTX': checked out '3ebbc93ded7285963bff932c678fa367eb393ba6' 2025-12-04T12:53:15.5959606Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-12-04T12:53:15.6097213Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-12-04T12:53:15.6232107Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-12-04T12:53:15.6419055Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-12-04T12:53:15.6491947Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-12-04T12:53:15.6663843Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T12:53:15.6736416Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-12-04T12:53:15.6809552Z Submodule path 'third_party/cpuinfo': checked out 'f858c30bcb16f8effd5ff46996f0514539e17abc' 2025-12-04T12:53:15.6888848Z Submodule path 'third_party/cudnn_frontend': checked out '0b1577c8c83401237d601d0d0db5210506705396' 2025-12-04T12:53:15.6999108Z Submodule path 'third_party/cutlass': checked out 'f88806b1e31dfa579842638740216dd41fc6c588' 2025-12-04T12:53:15.7111308Z Submodule path 'third_party/fbgemm': checked out 'c0b988d39a9e47c794d699f29930ed4d7c7e13a4' 2025-12-04T12:53:15.7192853Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 
2025-12-04T12:53:15.7397457Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T12:53:15.7466466Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-12-04T12:53:15.7566419Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '98125ce499b0fdf7ffbe0e3052f5b8709f4840f8' 2025-12-04T12:53:15.7632166Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T12:53:15.7683540Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-12-04T12:53:15.7759167Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-12-04T12:53:15.7851684Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-12-04T12:53:15.8032248Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-12-04T12:53:15.8140957Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-12-04T12:53:15.8256085Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-12-04T12:53:15.8315642Z Submodule path 'third_party/fmt': checked out '407c905e45ad75fc29bf0f9bb7c5c2fd3475976f' 2025-12-04T12:53:15.8369054Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-12-04T12:53:15.8436165Z Submodule path 'third_party/gloo': checked out '54cbae0d3a67fa890b4c3d9ee162b7860315e341' 2025-12-04T12:53:15.8507108Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T12:53:15.8562617Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-12-04T12:53:15.8735175Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-12-04T12:53:15.8790098Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-12-04T12:53:15.8868462Z Submodule path 'third_party/kineto': checked out '31f85df8fbd89c188f14ef10f1ec65379786b943' 2025-12-04T12:53:15.8951765Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1' 2025-12-04T12:53:15.9041483Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-12-04T12:53:15.9099393Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-12-04T12:53:15.9154400Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-12-04T12:53:15.9209189Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-12-04T12:53:15.9274349Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-12-04T12:53:15.9331128Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 
2025-12-04T12:53:15.9388346Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T12:53:15.9483457Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-12-04T12:53:15.9540945Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-12-04T12:53:15.9613095Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a' 2025-12-04T12:53:15.9688919Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159' 2025-12-04T12:53:15.9753202Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T12:53:15.9822957Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-12-04T12:53:15.9894457Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T12:53:16.0000443Z Submodule path 'third_party/kleidiai': checked out 'd7770c89632329a9914ef1a90289917597639cbe' 2025-12-04T12:53:16.0073237Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-12-04T12:53:16.0163917Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-12-04T12:53:16.0325298Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-12-04T12:53:16.0411430Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-12-04T12:53:16.0515796Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-12-04T12:53:16.0581027Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-12-04T12:53:16.0662124Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-12-04T12:53:16.0739662Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-12-04T12:53:16.0835417Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-12-04T12:53:16.0908940Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-12-04T12:53:16.0958577Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-12-04T12:53:16.1028159Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-12-04T12:53:16.1133035Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-12-04T12:53:16.1213160Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked 
out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T12:53:16.1367588Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-12-04T12:53:16.1431288Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-12-04T12:53:16.1594734Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-12-04T12:53:16.1668439Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-12-04T12:53:16.1738461Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-12-04T12:53:16.1801385Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-12-04T12:53:16.1857841Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-12-04T12:53:16.1927669Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-12-04T12:53:16.1977763Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-12-04T12:53:16.2045835Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-12-04T12:53:16.2123920Z Submodule path 'third_party/tensorpipe': checked out '2b4cd91092d335a697416b2a3cb398283246849d' 2025-12-04T12:53:16.2197871Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-12-04T12:53:16.2241579Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-12-04T12:53:16.2376848Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-12-04T12:53:16.2450817Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-12-04T12:53:16.2503818Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-12-04T12:53:16.2531966Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-12-04T12:53:16.2729997Z Entering 'android/libs/fbjni' 2025-12-04T12:53:16.2751659Z Entering 'third_party/FP16' 2025-12-04T12:53:16.2770685Z Entering 'third_party/FXdiv' 2025-12-04T12:53:16.2793126Z Entering 'third_party/NNPACK' 2025-12-04T12:53:16.2816026Z Entering 'third_party/NVTX' 2025-12-04T12:53:16.2838069Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:53:16.2864782Z Entering 'third_party/XNNPACK' 2025-12-04T12:53:16.2889822Z Entering 'third_party/aiter' 2025-12-04T12:53:16.2916251Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:53:16.2942055Z Entering 'third_party/benchmark' 2025-12-04T12:53:16.2964142Z Entering 'third_party/composable_kernel' 2025-12-04T12:53:16.2990533Z Entering 'third_party/cpp-httplib' 2025-12-04T12:53:16.3015818Z Entering 'third_party/cpuinfo' 2025-12-04T12:53:16.3036430Z Entering 'third_party/cudnn_frontend' 2025-12-04T12:53:16.3061825Z Entering 'third_party/cutlass' 2025-12-04T12:53:16.3086717Z Entering 'third_party/fbgemm' 2025-12-04T12:53:16.3108521Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T12:53:16.3129633Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:53:16.3154658Z Entering 
'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:53:16.3181217Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T12:53:16.3211321Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T12:53:16.3239797Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:53:16.3266286Z Entering 'third_party/fbgemm/external/json' 2025-12-04T12:53:16.3289359Z Entering 'third_party/flash-attention' 2025-12-04T12:53:16.3311895Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:53:16.3332915Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:53:16.3356109Z Entering 'third_party/flatbuffers' 2025-12-04T12:53:16.3378019Z Entering 'third_party/fmt' 2025-12-04T12:53:16.3398674Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:53:16.3419087Z Entering 'third_party/gloo' 2025-12-04T12:53:16.3443170Z Entering 'third_party/googletest' 2025-12-04T12:53:16.3468983Z Entering 'third_party/ideep' 2025-12-04T12:53:16.3493527Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T12:53:16.3524377Z Entering 'third_party/ittapi' 2025-12-04T12:53:16.3543815Z Entering 'third_party/kineto' 2025-12-04T12:53:16.3564964Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:53:16.3600535Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:53:16.3634469Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:53:16.3665911Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:53:16.3697360Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:53:16.3729594Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:53:16.3753504Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:53:16.3777697Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:53:16.3805993Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:53:16.3833513Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:53:16.3857492Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:53:16.3890076Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:16.3919968Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:16.3950097Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:53:16.3977740Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:53:16.4007398Z Entering 'third_party/kleidiai' 2025-12-04T12:53:16.4031029Z Entering 'third_party/mimalloc' 2025-12-04T12:53:16.4058707Z Entering 'third_party/nlohmann' 2025-12-04T12:53:16.4085964Z Entering 'third_party/onnx' 2025-12-04T12:53:16.4117534Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:53:16.4149204Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:53:16.4174156Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:53:16.4196386Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:53:16.4224793Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:53:16.4253884Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:53:16.4281531Z Entering 
'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:53:16.4302889Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:53:16.4322147Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:53:16.4342403Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:16.4371688Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:16.4403353Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:53:16.4438404Z Entering 'third_party/pocketfft' 2025-12-04T12:53:16.4467130Z Entering 'third_party/protobuf' 2025-12-04T12:53:16.4491347Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:53:16.4522171Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:53:16.4543980Z Entering 'third_party/psimd' 2025-12-04T12:53:16.4564527Z Entering 'third_party/pthreadpool' 2025-12-04T12:53:16.4585304Z Entering 'third_party/pybind11' 2025-12-04T12:53:16.4605287Z Entering 'third_party/python-peachpy' 2025-12-04T12:53:16.4626634Z Entering 'third_party/sleef' 2025-12-04T12:53:16.4646762Z Entering 'third_party/tensorpipe' 2025-12-04T12:53:16.4668681Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:16.4696113Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:16.4716409Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:16.4735745Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:16.4755208Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:16.4787301Z ##[endgroup] 2025-12-04T12:53:16.4787498Z ##[group]Persisting credentials for submodules 2025-12-04T12:53:16.4795848Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-12-04T12:53:16.5015477Z Entering 'android/libs/fbjni' 2025-12-04T12:53:16.5033258Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5033541Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5052398Z Entering 'third_party/FP16' 2025-12-04T12:53:16.5070612Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5070784Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5091011Z Entering 'third_party/FXdiv' 2025-12-04T12:53:16.5106540Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5106681Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5126340Z Entering 'third_party/NNPACK' 2025-12-04T12:53:16.5143760Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5143901Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5162083Z Entering 'third_party/NVTX' 2025-12-04T12:53:16.5178472Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5178613Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5194805Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:53:16.5207459Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5207592Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5222768Z Entering 'third_party/XNNPACK' 2025-12-04T12:53:16.5235051Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5261483Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5261608Z Entering 'third_party/aiter' 2025-12-04T12:53:16.5274095Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5274228Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5289643Z 
Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:53:16.5308264Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5308578Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5333990Z Entering 'third_party/benchmark' 2025-12-04T12:53:16.5348430Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5348678Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5367080Z Entering 'third_party/composable_kernel' 2025-12-04T12:53:16.5379554Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5379804Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5402458Z Entering 'third_party/cpp-httplib' 2025-12-04T12:53:16.5416695Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5416926Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5439703Z Entering 'third_party/cpuinfo' 2025-12-04T12:53:16.5460904Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5461120Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5482068Z Entering 'third_party/cudnn_frontend' 2025-12-04T12:53:16.5499560Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5499771Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5523868Z Entering 'third_party/cutlass' 2025-12-04T12:53:16.5546716Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5546916Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5572268Z Entering 'third_party/fbgemm' 2025-12-04T12:53:16.5595071Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5595265Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5619825Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T12:53:16.5638638Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5638933Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5665452Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:53:16.5684092Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5684381Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5712754Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:53:16.5737849Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5738068Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5767831Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T12:53:16.5782735Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5782928Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5812619Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T12:53:16.5829166Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5829360Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5847842Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:53:16.5870493Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5870688Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5889359Z Entering 'third_party/fbgemm/external/json' 2025-12-04T12:53:16.5906145Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5906316Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5926544Z Entering 'third_party/flash-attention' 2025-12-04T12:53:16.5939700Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5939863Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5962336Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:53:16.5977810Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5978166Z url.https://github.com/.insteadof 2025-12-04T12:53:16.5999171Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:53:16.6010798Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6011110Z url.https://github.com/.insteadof 
2025-12-04T12:53:16.6034696Z Entering 'third_party/flatbuffers' 2025-12-04T12:53:16.6051888Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6052033Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6070715Z Entering 'third_party/fmt' 2025-12-04T12:53:16.6083645Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6083787Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6108433Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:53:16.6127306Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6127444Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6150350Z Entering 'third_party/gloo' 2025-12-04T12:53:16.6170614Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6170749Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6190473Z Entering 'third_party/googletest' 2025-12-04T12:53:16.6209673Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6209814Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6227841Z Entering 'third_party/ideep' 2025-12-04T12:53:16.6246469Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6246605Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6267791Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T12:53:16.6281242Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6281373Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6314397Z Entering 'third_party/ittapi' 2025-12-04T12:53:16.6336390Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6336546Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6360160Z Entering 'third_party/kineto' 2025-12-04T12:53:16.6374951Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6375108Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6394589Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:53:16.6410221Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6410374Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6437886Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:53:16.6456980Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6457110Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6478007Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:53:16.6492698Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6493033Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6509991Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:53:16.6523313Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6523781Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6542500Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:53:16.6553749Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6553960Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6574021Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:53:16.6590651Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6590824Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6614772Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:53:16.6633411Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6633592Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6660696Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:53:16.6678429Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6678594Z url.https://github.com/.insteadof 
2025-12-04T12:53:16.6698904Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:53:16.6712021Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6712175Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6733648Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:53:16.6749217Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6749385Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6778141Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:53:16.6791809Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6791977Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6818115Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:16.6835680Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6847031Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6855522Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:16.6872234Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6872532Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6897587Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:53:16.6914470Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6914609Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6932708Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:53:16.6947801Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6947935Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6966355Z Entering 'third_party/kleidiai' 2025-12-04T12:53:16.6981161Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6981284Z url.https://github.com/.insteadof 2025-12-04T12:53:16.6998676Z Entering 'third_party/mimalloc' 2025-12-04T12:53:16.7011932Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7012059Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7027396Z Entering 'third_party/nlohmann' 2025-12-04T12:53:16.7039800Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7039926Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7057015Z Entering 'third_party/onnx' 2025-12-04T12:53:16.7069978Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7070103Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7095744Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:53:16.7108626Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7108751Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7129316Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:53:16.7145738Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7146198Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7165787Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:53:16.7179264Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7179387Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7197652Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:53:16.7213716Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7213857Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7232254Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:53:16.7247980Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7248200Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7268596Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 
2025-12-04T12:53:16.7285740Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7285885Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7306199Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:53:16.7320974Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7321121Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7338326Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:53:16.7351318Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7351443Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7370496Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:53:16.7385293Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7385417Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7403275Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:16.7418491Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7418616Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7436751Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:16.7449318Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7449439Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7466651Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:53:16.7487205Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7487328Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7515015Z Entering 'third_party/pocketfft' 2025-12-04T12:53:16.7531293Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7531420Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7549656Z Entering 'third_party/protobuf' 2025-12-04T12:53:16.7562809Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7562935Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7581044Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:53:16.7596233Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7596371Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7619539Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:53:16.7634634Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7634894Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7653954Z Entering 'third_party/psimd' 2025-12-04T12:53:16.7669986Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7670134Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7687240Z Entering 'third_party/pthreadpool' 2025-12-04T12:53:16.7702913Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7703056Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7721492Z Entering 'third_party/pybind11' 2025-12-04T12:53:16.7737022Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7737166Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7757627Z Entering 'third_party/python-peachpy' 2025-12-04T12:53:16.7772304Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7772447Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7791922Z Entering 'third_party/sleef' 2025-12-04T12:53:16.7806754Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7806900Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7826543Z Entering 'third_party/tensorpipe' 2025-12-04T12:53:16.7840143Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7840343Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7857916Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:16.7873853Z 
url.https://github.com/.insteadof 2025-12-04T12:53:16.7893987Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7894151Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:16.7908039Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7908190Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7927136Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:16.7941702Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7941830Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7960323Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:16.7974619Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7974741Z url.https://github.com/.insteadof 2025-12-04T12:53:16.7991661Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:16.8006462Z url.https://github.com/.insteadof 2025-12-04T12:53:16.8006587Z url.https://github.com/.insteadof 2025-12-04T12:53:16.8042453Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-12-04T12:53:16.8227039Z Entering 'android/libs/fbjni' 2025-12-04T12:53:16.8250562Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T12:53:16.8260053Z Entering 'third_party/FP16' 2025-12-04T12:53:16.8283229Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T12:53:16.8293108Z Entering 'third_party/FXdiv' 2025-12-04T12:53:16.8316062Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T12:53:16.8325260Z Entering 'third_party/NNPACK' 2025-12-04T12:53:16.8347920Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T12:53:16.8357934Z Entering 'third_party/NVTX' 2025-12-04T12:53:16.8377460Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T12:53:16.8388296Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:53:16.8407262Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T12:53:16.8416722Z Entering 'third_party/XNNPACK' 2025-12-04T12:53:16.8441475Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T12:53:16.8459559Z Entering 'third_party/aiter' 2025-12-04T12:53:16.8485089Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T12:53:16.8495877Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:53:16.8517548Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T12:53:16.8531937Z Entering 'third_party/benchmark' 2025-12-04T12:53:16.8559840Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T12:53:16.8570656Z Entering 'third_party/composable_kernel' 2025-12-04T12:53:16.8591080Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T12:53:16.8607111Z Entering 'third_party/cpp-httplib' 2025-12-04T12:53:16.8629832Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config 
remote.origin.url 2025-12-04T12:53:16.8641280Z Entering 'third_party/cpuinfo' 2025-12-04T12:53:16.8663537Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T12:53:16.8673284Z Entering 'third_party/cudnn_frontend' 2025-12-04T12:53:16.8693749Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T12:53:16.8703688Z Entering 'third_party/cutlass' 2025-12-04T12:53:16.8727258Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T12:53:16.8741182Z Entering 'third_party/fbgemm' 2025-12-04T12:53:16.8761072Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T12:53:16.8772658Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T12:53:16.8797413Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T12:53:16.8808967Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:53:16.8830247Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T12:53:16.8844748Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:53:16.8865125Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T12:53:16.8878151Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T12:53:16.8896702Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T12:53:16.8908946Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T12:53:16.8929267Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T12:53:16.8938457Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:53:16.8958141Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T12:53:16.8967692Z Entering 'third_party/fbgemm/external/json' 2025-12-04T12:53:16.8988285Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T12:53:16.9000761Z Entering 'third_party/flash-attention' 2025-12-04T12:53:16.9021797Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T12:53:16.9031788Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:53:16.9060292Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T12:53:16.9075789Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:53:16.9094470Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T12:53:16.9109491Z Entering 'third_party/flatbuffers' 2025-12-04T12:53:16.9128366Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T12:53:16.9140845Z Entering 'third_party/fmt' 2025-12-04T12:53:16.9162759Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T12:53:16.9173254Z Entering 'third_party/gemmlowp/gemmlowp' 
2025-12-04T12:53:16.9193548Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T12:53:16.9207383Z Entering 'third_party/gloo' 2025-12-04T12:53:16.9228268Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T12:53:16.9238912Z Entering 'third_party/googletest' 2025-12-04T12:53:16.9259044Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:53:16.9270219Z Entering 'third_party/ideep' 2025-12-04T12:53:16.9289409Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T12:53:16.9299492Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T12:53:16.9328045Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T12:53:16.9345373Z Entering 'third_party/ittapi' 2025-12-04T12:53:16.9369489Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T12:53:16.9379870Z Entering 'third_party/kineto' 2025-12-04T12:53:16.9400692Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T12:53:16.9410892Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:53:16.9430791Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T12:53:16.9440435Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:53:16.9461934Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T12:53:16.9472487Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:53:16.9492529Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T12:53:16.9504742Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:53:16.9532968Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T12:53:16.9542506Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:53:16.9565464Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T12:53:16.9577953Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:53:16.9603816Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T12:53:16.9615194Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:53:16.9637528Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T12:53:16.9647749Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:53:16.9667505Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:53:16.9677015Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:53:16.9697317Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T12:53:16.9707846Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:53:16.9726807Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T12:53:16.9744005Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:53:16.9766918Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T12:53:16.9776602Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:16.9803280Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T12:53:16.9814919Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:16.9835047Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T12:53:16.9849338Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:53:16.9870052Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T12:53:16.9879653Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:53:16.9898288Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T12:53:16.9915939Z Entering 'third_party/kleidiai' 2025-12-04T12:53:16.9936857Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T12:53:16.9952546Z Entering 'third_party/mimalloc' 2025-12-04T12:53:16.9973829Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T12:53:16.9983961Z Entering 'third_party/nlohmann' 2025-12-04T12:53:17.0004976Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T12:53:17.0020500Z Entering 'third_party/onnx' 2025-12-04T12:53:17.0039569Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T12:53:17.0062404Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:53:17.0084879Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T12:53:17.0098843Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:53:17.0122930Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T12:53:17.0134375Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 
2025-12-04T12:53:17.0153917Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T12:53:17.0168841Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:53:17.0189465Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:53:17.0199737Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:53:17.0224179Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T12:53:17.0234056Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:53:17.0253592Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T12:53:17.0270149Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:53:17.0290079Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T12:53:17.0299170Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:53:17.0320648Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T12:53:17.0333098Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:53:17.0352592Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T12:53:17.0362361Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:17.0383223Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T12:53:17.0393829Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:17.0419630Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T12:53:17.0433290Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:53:17.0453719Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T12:53:17.0473534Z Entering 'third_party/pocketfft' 2025-12-04T12:53:17.0498134Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T12:53:17.0508724Z Entering 'third_party/protobuf' 2025-12-04T12:53:17.0529937Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T12:53:17.0541036Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:53:17.0563839Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T12:53:17.0575944Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:53:17.0616939Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:53:17.0635312Z Entering 'third_party/psimd' 
2025-12-04T12:53:17.0667386Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T12:53:17.0677672Z Entering 'third_party/pthreadpool' 2025-12-04T12:53:17.0709706Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T12:53:17.0719972Z Entering 'third_party/pybind11' 2025-12-04T12:53:17.0741745Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T12:53:17.0751563Z Entering 'third_party/python-peachpy' 2025-12-04T12:53:17.0771448Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T12:53:17.0781943Z Entering 'third_party/sleef' 2025-12-04T12:53:17.0802850Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T12:53:17.0812913Z Entering 'third_party/tensorpipe' 2025-12-04T12:53:17.0839801Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T12:53:17.0848638Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:17.0878084Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:53:17.0887417Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:17.0910463Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T12:53:17.0921013Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:17.0946586Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T12:53:17.0955049Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:17.0978992Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T12:53:17.0988126Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:17.1006791Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T12:53:17.1270003Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-12-04T12:53:17.1445621Z Entering 'android/libs/fbjni' 2025-12-04T12:53:17.1466440Z Entering 'third_party/FP16' 2025-12-04T12:53:17.1489034Z Entering 'third_party/FXdiv' 2025-12-04T12:53:17.1512047Z Entering 'third_party/NNPACK' 2025-12-04T12:53:17.1532443Z Entering 'third_party/NVTX' 2025-12-04T12:53:17.1555872Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:53:17.1577653Z Entering 'third_party/XNNPACK' 2025-12-04T12:53:17.1608498Z Entering 'third_party/aiter' 2025-12-04T12:53:17.1632847Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:53:17.1665202Z Entering 'third_party/benchmark' 2025-12-04T12:53:17.1692584Z Entering 'third_party/composable_kernel' 2025-12-04T12:53:17.1718627Z Entering 'third_party/cpp-httplib' 2025-12-04T12:53:17.1740223Z Entering 'third_party/cpuinfo' 2025-12-04T12:53:17.1761926Z Entering 'third_party/cudnn_frontend' 2025-12-04T12:53:17.1784246Z Entering 'third_party/cutlass' 2025-12-04T12:53:17.1808442Z Entering 'third_party/fbgemm' 2025-12-04T12:53:17.1830763Z 
Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T12:53:17.1851028Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:53:17.1880760Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:53:17.1904319Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T12:53:17.1930403Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T12:53:17.1954168Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:53:17.1976853Z Entering 'third_party/fbgemm/external/json' 2025-12-04T12:53:17.2000849Z Entering 'third_party/flash-attention' 2025-12-04T12:53:17.2021501Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:53:17.2054264Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:53:17.2081980Z Entering 'third_party/flatbuffers' 2025-12-04T12:53:17.2108862Z Entering 'third_party/fmt' 2025-12-04T12:53:17.2128162Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:53:17.2151059Z Entering 'third_party/gloo' 2025-12-04T12:53:17.2171228Z Entering 'third_party/googletest' 2025-12-04T12:53:17.2194518Z Entering 'third_party/ideep' 2025-12-04T12:53:17.2214097Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T12:53:17.2238386Z Entering 'third_party/ittapi' 2025-12-04T12:53:17.2261923Z Entering 'third_party/kineto' 2025-12-04T12:53:17.2282656Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:53:17.2314183Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:53:17.2337380Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:53:17.2360800Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:53:17.2384976Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:53:17.2403260Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:53:17.2427353Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:53:17.2446929Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:53:17.2466774Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:53:17.2491658Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:53:17.2515377Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:53:17.2540120Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:17.2561557Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:17.2592955Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:53:17.2612186Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:53:17.2634745Z Entering 'third_party/kleidiai' 2025-12-04T12:53:17.2656496Z Entering 'third_party/mimalloc' 2025-12-04T12:53:17.2682939Z Entering 'third_party/nlohmann' 2025-12-04T12:53:17.2706258Z Entering 'third_party/onnx' 2025-12-04T12:53:17.2740699Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:53:17.2762648Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:53:17.2785771Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:53:17.2807985Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:53:17.2828746Z Entering 
'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:53:17.2847831Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:53:17.2879837Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:53:17.2898997Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:53:17.2917441Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:53:17.2935559Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:17.2956128Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:17.2979628Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:53:17.3010690Z Entering 'third_party/pocketfft' 2025-12-04T12:53:17.3032100Z Entering 'third_party/protobuf' 2025-12-04T12:53:17.3055195Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:53:17.3075091Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:53:17.3097844Z Entering 'third_party/psimd' 2025-12-04T12:53:17.3126466Z Entering 'third_party/pthreadpool' 2025-12-04T12:53:17.3147043Z Entering 'third_party/pybind11' 2025-12-04T12:53:17.3165867Z Entering 'third_party/python-peachpy' 2025-12-04T12:53:17.3185252Z Entering 'third_party/sleef' 2025-12-04T12:53:17.3215596Z Entering 'third_party/tensorpipe' 2025-12-04T12:53:17.3239360Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:17.3257671Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:17.3276878Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:17.3297472Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:17.3317430Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:17.3363363Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-12-04T12:53:17.3539358Z Entering 'android/libs/fbjni' 2025-12-04T12:53:17.3571867Z Entering 'third_party/FP16' 2025-12-04T12:53:17.3597547Z Entering 'third_party/FXdiv' 2025-12-04T12:53:17.3619527Z Entering 'third_party/NNPACK' 2025-12-04T12:53:17.3639322Z Entering 'third_party/NVTX' 2025-12-04T12:53:17.3660560Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:53:17.3682823Z Entering 'third_party/XNNPACK' 2025-12-04T12:53:17.3712461Z Entering 'third_party/aiter' 2025-12-04T12:53:17.3743452Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:53:17.3772233Z Entering 'third_party/benchmark' 2025-12-04T12:53:17.3793566Z Entering 'third_party/composable_kernel' 2025-12-04T12:53:17.3818174Z Entering 'third_party/cpp-httplib' 2025-12-04T12:53:17.3837524Z Entering 'third_party/cpuinfo' 2025-12-04T12:53:17.3864802Z Entering 'third_party/cudnn_frontend' 2025-12-04T12:53:17.3885676Z Entering 'third_party/cutlass' 2025-12-04T12:53:17.3909696Z Entering 'third_party/fbgemm' 2025-12-04T12:53:17.3931380Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T12:53:17.3952121Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:53:17.3984249Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:53:17.4006992Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T12:53:17.4038402Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T12:53:17.4061165Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:53:17.4081366Z Entering 
'third_party/fbgemm/external/json' 2025-12-04T12:53:17.4105553Z Entering 'third_party/flash-attention' 2025-12-04T12:53:17.4126169Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:53:17.4146568Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:53:17.4172086Z Entering 'third_party/flatbuffers' 2025-12-04T12:53:17.4194103Z Entering 'third_party/fmt' 2025-12-04T12:53:17.4213507Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:53:17.4240916Z Entering 'third_party/gloo' 2025-12-04T12:53:17.4262427Z Entering 'third_party/googletest' 2025-12-04T12:53:17.4282002Z Entering 'third_party/ideep' 2025-12-04T12:53:17.4301907Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T12:53:17.4328702Z Entering 'third_party/ittapi' 2025-12-04T12:53:17.4348965Z Entering 'third_party/kineto' 2025-12-04T12:53:17.4368512Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:53:17.4387979Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:53:17.4412121Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:53:17.4435608Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:53:17.4454974Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:53:17.4475830Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:53:17.4498524Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:53:17.4518373Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:53:17.4537331Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:53:17.4557055Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:53:17.4576282Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:53:17.4596431Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:17.4623906Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:17.4646312Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:53:17.4665190Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:53:17.4687354Z Entering 'third_party/kleidiai' 2025-12-04T12:53:17.4710445Z Entering 'third_party/mimalloc' 2025-12-04T12:53:17.4736851Z Entering 'third_party/nlohmann' 2025-12-04T12:53:17.4758008Z Entering 'third_party/onnx' 2025-12-04T12:53:17.4786159Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:53:17.4816557Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:53:17.4840114Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:53:17.4861344Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:53:17.4880313Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:53:17.4899914Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:53:17.4921501Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:53:17.4946932Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:53:17.4966220Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:53:17.4986097Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:53:17.5007831Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:53:17.5030075Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:53:17.5061443Z Entering 'third_party/pocketfft' 2025-12-04T12:53:17.5088215Z Entering 'third_party/protobuf' 2025-12-04T12:53:17.5110109Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:53:17.5134196Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:53:17.5157201Z Entering 'third_party/psimd' 2025-12-04T12:53:17.5176868Z Entering 'third_party/pthreadpool' 2025-12-04T12:53:17.5196800Z Entering 'third_party/pybind11' 2025-12-04T12:53:17.5220904Z Entering 'third_party/python-peachpy' 2025-12-04T12:53:17.5244497Z Entering 'third_party/sleef' 2025-12-04T12:53:17.5264688Z Entering 'third_party/tensorpipe' 2025-12-04T12:53:17.5287645Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:53:17.5308423Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:53:17.5327822Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:53:17.5350106Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:53:17.5372162Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:53:17.5406130Z ##[endgroup] 2025-12-04T12:53:17.5729561Z [command]/usr/bin/git log -1 --format=%H 2025-12-04T12:53:17.5843739Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:53:17.5982074Z Prepare all required actions 2025-12-04T12:53:17.5982400Z Getting action download info 2025-12-04T12:53:17.8272025Z Download action repository 'aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076' (SHA:062b18b96a7aff071d4dc91bc00c4c1a7945b076) 2025-12-04T12:53:18.5971355Z ##[group]Run ./.github/actions/setup-rocm 2025-12-04T12:53:18.5971487Z env: 2025-12-04T12:53:18.5971575Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.5971672Z ##[endgroup] 2025-12-04T12:53:18.5985278Z ##[group]Run dpkg -l | grep -E " rocm" 2025-12-04T12:53:18.5985413Z dpkg -l | grep -E " rocm" 2025-12-04T12:53:18.5990337Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:18.5990475Z env: 2025-12-04T12:53:18.5990555Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.5990652Z ##[endgroup] 2025-12-04T12:53:18.6047253Z ii rocm-cmake 0.14.0.60401-83~22.04 amd64 rocm-cmake built using CMake 2025-12-04T12:53:18.6047506Z ii rocm-core 6.4.1.60401-83~22.04 amd64 ROCm Runtime software stack 2025-12-04T12:53:18.6047722Z ii rocm-dbgapi 0.77.2.60401-83~22.04 amd64 Library to provide AMD GPU debugger API 2025-12-04T12:53:18.6047983Z ii rocm-debug-agent 2.0.4.60401-83~22.04 amd64 Radeon Open Compute Debug Agent (ROCdebug-agent) 2025-12-04T12:53:18.6048228Z ii rocm-dev 6.4.1.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-12-04T12:53:18.6048459Z ii rocm-device-libs 1.0.0.60401-83~22.04 amd64 Radeon Open Compute - device libraries 2025-12-04T12:53:18.6048663Z ii rocm-gdb 15.2.60401-83~22.04 amd64 ROCgdb 2025-12-04T12:53:18.6048850Z ii rocm-llvm 19.0.0.25184.60401-83~22.04 amd64 ROCm core compiler 2025-12-04T12:53:18.6049054Z ii rocm-opencl 2.0.0.60401-83~22.04 amd64 clr built using CMake 2025-12-04T12:53:18.6049258Z ii rocm-opencl-dev 2.0.0.60401-83~22.04 amd64 clr built using CMake 2025-12-04T12:53:18.6049721Z ii rocm-smi-lib 7.5.0.60401-83~22.04 amd64 AMD System Management libraries 2025-12-04T12:53:18.6050113Z ii rocm-utils 
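The `url.<base>.insteadOf` passes above rewrite SSH-style GitHub remotes to anonymous HTTPS in the top-level clone and in every submodule, so fetches work without SSH credentials; the per-submodule config files under .git/modules/ listed above are where those entries land. A minimal sketch of the same mechanism outside CI (assuming a fresh clone; the remote prefix shown is one of the two rewritten above):

    # Redirect git@github.com: remotes to https://github.com/ for this clone only.
    # --local writes to .git/config; `submodule foreach --recursive` repeats the
    # setting in each submodule's own config under .git/modules/...
    git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:'
    git submodule foreach --recursive \
      git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:'

After this, a fetch of git@github.com:owner/repo.git is transparently retargeted to https://github.com/owner/repo.git.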
6.4.1.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-12-04T12:53:18.6050408Z ii rocminfo 1.0.0.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime rocminfo tool 2025-12-04T12:53:18.6071491Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T12:53:18.6071827Z # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T12:53:18.6072040Z # shellcheck disable=SC2046 2025-12-04T12:53:18.6072242Z docker stop $(docker ps -q) || true 2025-12-04T12:53:18.6072421Z # Prune all stopped containers. 2025-12-04T12:53:18.6072583Z docker container prune -f 2025-12-04T12:53:18.6077791Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:18.6078119Z env: 2025-12-04T12:53:18.6078231Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.6078371Z ##[endgroup] 2025-12-04T12:53:18.6325469Z docker: 'docker stop' requires at least 1 argument 2025-12-04T12:53:18.6325582Z 2025-12-04T12:53:18.6325649Z Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...] 2025-12-04T12:53:18.6325746Z 2025-12-04T12:53:18.6326413Z See 'docker stop --help' for more information 2025-12-04T12:53:18.6441806Z Total reclaimed space: 0B 2025-12-04T12:53:18.6474349Z ##[group]Run cat /etc/os-release || true 2025-12-04T12:53:18.6474578Z cat /etc/os-release || true 2025-12-04T12:53:18.6474758Z cat /etc/apt/sources.list.d/rocm.list || true 2025-12-04T12:53:18.6475120Z cat /opt/rocm/.info/version || true 2025-12-04T12:53:18.6475267Z whoami 2025-12-04T12:53:18.6480620Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:18.6480802Z env: 2025-12-04T12:53:18.6480922Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.6481054Z ##[endgroup] 2025-12-04T12:53:18.6507441Z PRETTY_NAME="Ubuntu 22.04.5 LTS" 2025-12-04T12:53:18.6507747Z NAME="Ubuntu" 2025-12-04T12:53:18.6507930Z VERSION_ID="22.04" 2025-12-04T12:53:18.6508178Z VERSION="22.04.5 LTS (Jammy Jellyfish)" 2025-12-04T12:53:18.6508427Z VERSION_CODENAME=jammy 2025-12-04T12:53:18.6508622Z ID=ubuntu 2025-12-04T12:53:18.6508786Z ID_LIKE=debian 2025-12-04T12:53:18.6509008Z HOME_URL="https://www.ubuntu.com/" 2025-12-04T12:53:18.6509272Z SUPPORT_URL="https://help.ubuntu.com/" 2025-12-04T12:53:18.6509569Z BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" 2025-12-04T12:53:18.6509996Z PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" 2025-12-04T12:53:18.6510474Z UBUNTU_CODENAME=jammy 2025-12-04T12:53:18.6515217Z deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/6.4.1 jammy main 2025-12-04T12:53:18.6522079Z 6.4.1-83 2025-12-04T12:53:18.6529244Z runner 2025-12-04T12:53:18.6542416Z ##[group]Run dpkg -l | grep -E " amdgpu" 2025-12-04T12:53:18.6542582Z dpkg -l | grep -E " amdgpu" 2025-12-04T12:53:18.6545628Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:18.6545786Z env: 2025-12-04T12:53:18.6545879Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.6545990Z ##[endgroup] 2025-12-04T12:53:18.6596055Z ii amdgpu-core 1:6.4.60401-2164967.22.04 all Core meta package for unified amdgpu driver. 
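The `docker stop` usage error above is benign: with no containers running, `$(docker ps -q)` expands to nothing, `docker stop` is invoked with zero arguments, and the step's `|| true` swallows the failure before `docker container prune -f` reclaims nothing (0B). A quieter variant of the same cleanup, as a sketch rather than the workflow's actual step:

    # Stop containers only when at least one is running, then prune stopped ones.
    running=$(docker ps -q)
    if [ -n "$running" ]; then
      # shellcheck disable=SC2086  # splitting the ID list into words is intended
      docker stop $running
    fi
    docker container prune -f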
2025-12-04T12:53:18.6596525Z ii amdgpu-install 6.4.60401-2164967.22.04 all AMDGPU driver repository and installer 2025-12-04T12:53:18.6618335Z ##[group]Run rocm-smi 2025-12-04T12:53:18.6618534Z rocm-smi 2025-12-04T12:53:18.6623800Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:18.6623962Z env: 2025-12-04T12:53:18.6624066Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.6624179Z ##[endgroup] 2025-12-04T12:53:18.7256343Z 2025-12-04T12:53:18.7256473Z 2025-12-04T12:53:18.7256717Z ============================================ ROCm System Management Interface ============================================ 2025-12-04T12:53:18.7256955Z ====================================================== Concise Info ====================================================== 2025-12-04T12:53:18.7257203Z Device Node IDs Temp Power Partitions SCLK MCLK Fan Perf PwrCap VRAM% GPU% 2025-12-04T12:53:18.7257830Z  (DID, GUID) (Junction) (Socket) (Mem, Compute, ID)  2025-12-04T12:53:18.7260456Z ========================================================================================================================== 2025-12-04T12:53:18.7261077Z 0 3 0x74a5, 51110 26.0°C 118.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T12:53:18.7261389Z 1 5 0x74a5, 2987 28.0°C 117.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T12:53:18.7261891Z 2 4 0x74a5, 61326 25.0°C 119.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T12:53:18.7262172Z 3 2 0x74a5, 9091 26.0°C 125.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T12:53:18.7262358Z ========================================================================================================================== 2025-12-04T12:53:18.7262535Z ================================================== End of ROCm SMI Log =================================================== 2025-12-04T12:53:18.7322934Z ##[group]Run rocminfo 2025-12-04T12:53:18.7323085Z rocminfo 2025-12-04T12:53:18.7327316Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:18.7327506Z env: 2025-12-04T12:53:18.7327628Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.7327763Z ##[endgroup] 2025-12-04T12:53:18.8212214Z ROCk module version 6.12.12 is loaded 2025-12-04T12:53:18.8212387Z ===================== 2025-12-04T12:53:18.8212504Z HSA System Attributes 2025-12-04T12:53:18.8212609Z ===================== 2025-12-04T12:53:18.8212722Z Runtime Version: 1.15 2025-12-04T12:53:18.8212835Z Runtime Ext Version: 1.7 2025-12-04T12:53:18.8212959Z System Timestamp Freq.: 1000.000000MHz 2025-12-04T12:53:18.8213154Z Sig. 
Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2025-12-04T12:53:18.8213361Z Machine Model: LARGE 2025-12-04T12:53:18.8213528Z System Endianness: LITTLE 2025-12-04T12:53:18.8213676Z Mwaitx: DISABLED 2025-12-04T12:53:18.8213800Z XNACK enabled: NO 2025-12-04T12:53:18.8213918Z DMAbuf Support: YES 2025-12-04T12:53:18.8214021Z VMM Support: YES 2025-12-04T12:53:18.8214093Z 2025-12-04T12:53:18.8214129Z ========== 2025-12-04T12:53:18.8214237Z HSA Agents 2025-12-04T12:53:18.8214331Z ========== 2025-12-04T12:53:18.8214426Z ******* 2025-12-04T12:53:18.8214520Z Agent 1 2025-12-04T12:53:18.8214619Z ******* 2025-12-04T12:53:18.8214747Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T12:53:18.8214942Z Uuid: CPU-XX 2025-12-04T12:53:18.8215097Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T12:53:18.8215265Z Vendor Name: CPU 2025-12-04T12:53:18.8215419Z Feature: None specified 2025-12-04T12:53:18.8215573Z Profile: FULL_PROFILE 2025-12-04T12:53:18.8215731Z Float Round Mode: NEAR 2025-12-04T12:53:18.8215895Z Max Queue Number: 0(0x0) 2025-12-04T12:53:18.8216057Z Queue Min Size: 0(0x0) 2025-12-04T12:53:18.8216209Z Queue Max Size: 0(0x0) 2025-12-04T12:53:18.8216358Z Queue Type: MULTI 2025-12-04T12:53:18.8216503Z Node: 0 2025-12-04T12:53:18.8216737Z Device Type: CPU 2025-12-04T12:53:18.8216871Z Cache Info: 2025-12-04T12:53:18.8216991Z L1: 49152(0xc000) KB 2025-12-04T12:53:18.8217132Z Chip ID: 0(0x0) 2025-12-04T12:53:18.8217283Z ASIC Revision: 0(0x0) 2025-12-04T12:53:18.8217436Z Cacheline Size: 64(0x40) 2025-12-04T12:53:18.8217591Z Max Clock Freq. (MHz): 3300 2025-12-04T12:53:18.8217744Z BDFID: 0 2025-12-04T12:53:18.8218061Z Internal Node ID: 0 2025-12-04T12:53:18.8218221Z Compute Unit: 64 2025-12-04T12:53:18.8218373Z SIMDs per CU: 0 2025-12-04T12:53:18.8218523Z Shader Engines: 0 2025-12-04T12:53:18.8218688Z Shader Arrs. per Eng.: 0 2025-12-04T12:53:18.8218850Z WatchPts on Addr. 
Ranges:1 2025-12-04T12:53:18.8218994Z Memory Properties: 2025-12-04T12:53:18.8219104Z Features: None 2025-12-04T12:53:18.8219215Z Pool Info: 2025-12-04T12:53:18.8219452Z Pool 1 2025-12-04T12:53:18.8219589Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T12:53:18.8219742Z Size: 1584733176(0x5e751bf8) KB 2025-12-04T12:53:18.8219899Z Allocatable: TRUE 2025-12-04T12:53:18.8220052Z Alloc Granule: 4KB 2025-12-04T12:53:18.8220256Z Alloc Recommended Granule:4KB 2025-12-04T12:53:18.8220420Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8220574Z Accessible by all: TRUE 2025-12-04T12:53:18.8220733Z Pool 2 2025-12-04T12:53:18.8220869Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T12:53:18.8221025Z Size: 1584733176(0x5e751bf8) KB 2025-12-04T12:53:18.8221173Z Allocatable: TRUE 2025-12-04T12:53:18.8221331Z Alloc Granule: 4KB 2025-12-04T12:53:18.8221490Z Alloc Recommended Granule:4KB 2025-12-04T12:53:18.8221657Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8221816Z Accessible by all: TRUE 2025-12-04T12:53:18.8221949Z Pool 3 2025-12-04T12:53:18.8222081Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T12:53:18.8222231Z Size: 1584733176(0x5e751bf8) KB 2025-12-04T12:53:18.8222374Z Allocatable: TRUE 2025-12-04T12:53:18.8222530Z Alloc Granule: 4KB 2025-12-04T12:53:18.8222696Z Alloc Recommended Granule:4KB 2025-12-04T12:53:18.8222857Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8223015Z Accessible by all: TRUE 2025-12-04T12:53:18.8223148Z Pool 4 2025-12-04T12:53:18.8223275Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T12:53:18.8223426Z Size: 1584733176(0x5e751bf8) KB 2025-12-04T12:53:18.8223568Z Allocatable: TRUE 2025-12-04T12:53:18.8223724Z Alloc Granule: 4KB 2025-12-04T12:53:18.8223887Z Alloc Recommended Granule:4KB 2025-12-04T12:53:18.8224049Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8224210Z Accessible by all: TRUE 2025-12-04T12:53:18.8224343Z ISA Info: 2025-12-04T12:53:18.8224450Z ******* 2025-12-04T12:53:18.8224552Z Agent 2 2025-12-04T12:53:18.8224648Z ******* 2025-12-04T12:53:18.8224769Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T12:53:18.8224914Z Uuid: CPU-XX 2025-12-04T12:53:18.8225856Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T12:53:18.8226017Z Vendor Name: CPU 2025-12-04T12:53:18.8226166Z Feature: None specified 2025-12-04T12:53:18.8226315Z Profile: FULL_PROFILE 2025-12-04T12:53:18.8226470Z Float Round Mode: NEAR 2025-12-04T12:53:18.8226619Z Max Queue Number: 0(0x0) 2025-12-04T12:53:18.8226771Z Queue Min Size: 0(0x0) 2025-12-04T12:53:18.8226920Z Queue Max Size: 0(0x0) 2025-12-04T12:53:18.8227106Z Queue Type: MULTI 2025-12-04T12:53:18.8227249Z Node: 1 2025-12-04T12:53:18.8227386Z Device Type: CPU 2025-12-04T12:53:18.8227524Z Cache Info: 2025-12-04T12:53:18.8227640Z L1: 49152(0xc000) KB 2025-12-04T12:53:18.8227772Z Chip ID: 0(0x0) 2025-12-04T12:53:18.8227917Z ASIC Revision: 0(0x0) 2025-12-04T12:53:18.8228069Z Cacheline Size: 64(0x40) 2025-12-04T12:53:18.8228218Z Max Clock Freq. (MHz): 3300 2025-12-04T12:53:18.8228363Z BDFID: 0 2025-12-04T12:53:18.8228508Z Internal Node ID: 1 2025-12-04T12:53:18.8228660Z Compute Unit: 64 2025-12-04T12:53:18.8228814Z SIMDs per CU: 0 2025-12-04T12:53:18.8228961Z Shader Engines: 0 2025-12-04T12:53:18.8229117Z Shader Arrs. per Eng.: 0 2025-12-04T12:53:18.8229281Z WatchPts on Addr. 
Ranges:1 2025-12-04T12:53:18.8229418Z Memory Properties: 2025-12-04T12:53:18.8229528Z Features: None 2025-12-04T12:53:18.8229634Z Pool Info: 2025-12-04T12:53:18.8229738Z Pool 1 2025-12-04T12:53:18.8229868Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T12:53:18.8230014Z Size: 1585355624(0x5e7e9b68) KB 2025-12-04T12:53:18.8230162Z Allocatable: TRUE 2025-12-04T12:53:18.8230376Z Alloc Granule: 4KB 2025-12-04T12:53:18.8230539Z Alloc Recommended Granule:4KB 2025-12-04T12:53:18.8230705Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8230866Z Accessible by all: TRUE 2025-12-04T12:53:18.8231001Z Pool 2 2025-12-04T12:53:18.8231131Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T12:53:18.8231275Z Size: 1585355624(0x5e7e9b68) KB 2025-12-04T12:53:18.8231423Z Allocatable: TRUE 2025-12-04T12:53:18.8231578Z Alloc Granule: 4KB 2025-12-04T12:53:18.8231735Z Alloc Recommended Granule:4KB 2025-12-04T12:53:18.8231899Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8232058Z Accessible by all: TRUE 2025-12-04T12:53:18.8232193Z Pool 3 2025-12-04T12:53:18.8232323Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T12:53:18.8232467Z Size: 1585355624(0x5e7e9b68) KB 2025-12-04T12:53:18.8232659Z Allocatable: TRUE 2025-12-04T12:53:18.8232815Z Alloc Granule: 4KB 2025-12-04T12:53:18.8232974Z Alloc Recommended Granule:4KB 2025-12-04T12:53:18.8233137Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8233296Z Accessible by all: TRUE 2025-12-04T12:53:18.8233428Z Pool 4 2025-12-04T12:53:18.8233556Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T12:53:18.8233699Z Size: 1585355624(0x5e7e9b68) KB 2025-12-04T12:53:18.8233898Z Allocatable: TRUE 2025-12-04T12:53:18.8234053Z Alloc Granule: 4KB 2025-12-04T12:53:18.8234210Z Alloc Recommended Granule:4KB 2025-12-04T12:53:18.8234374Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8234534Z Accessible by all: TRUE 2025-12-04T12:53:18.8234666Z ISA Info: 2025-12-04T12:53:18.8234767Z ******* 2025-12-04T12:53:18.8234861Z Agent 3 2025-12-04T12:53:18.8234957Z ******* 2025-12-04T12:53:18.8235070Z Name: gfx942 2025-12-04T12:53:18.8235209Z Uuid: GPU-73d95c9754364571 2025-12-04T12:53:18.8235361Z Marketing Name: AMD Instinct MI325X 2025-12-04T12:53:18.8235519Z Vendor Name: AMD 2025-12-04T12:53:18.8235672Z Feature: KERNEL_DISPATCH 2025-12-04T12:53:18.8235823Z Profile: BASE_PROFILE 2025-12-04T12:53:18.8235972Z Float Round Mode: NEAR 2025-12-04T12:53:18.8236132Z Max Queue Number: 128(0x80) 2025-12-04T12:53:18.8236285Z Queue Min Size: 64(0x40) 2025-12-04T12:53:18.8236432Z Queue Max Size: 131072(0x20000) 2025-12-04T12:53:18.8236582Z Queue Type: MULTI 2025-12-04T12:53:18.8236724Z Node: 2 2025-12-04T12:53:18.8236862Z Device Type: GPU 2025-12-04T12:53:18.8236996Z Cache Info: 2025-12-04T12:53:18.8237107Z L1: 32(0x20) KB 2025-12-04T12:53:18.8237243Z L2: 4096(0x1000) KB 2025-12-04T12:53:18.8237374Z L3: 262144(0x40000) KB 2025-12-04T12:53:18.8237504Z Chip ID: 29861(0x74a5) 2025-12-04T12:53:18.8237652Z ASIC Revision: 1(0x1) 2025-12-04T12:53:18.8237808Z Cacheline Size: 128(0x80) 2025-12-04T12:53:18.8237959Z Max Clock Freq. (MHz): 2100 2025-12-04T12:53:18.8238108Z BDFID: 29952 2025-12-04T12:53:18.8238259Z Internal Node ID: 2 2025-12-04T12:53:18.8238405Z Compute Unit: 304 2025-12-04T12:53:18.8238552Z SIMDs per CU: 4 2025-12-04T12:53:18.8238699Z Shader Engines: 32 2025-12-04T12:53:18.8238854Z Shader Arrs. per Eng.: 1 2025-12-04T12:53:18.8239014Z WatchPts on Addr. 
Ranges:4 2025-12-04T12:53:18.8239171Z Coherent Host Access: FALSE 2025-12-04T12:53:18.8239311Z Memory Properties: 2025-12-04T12:53:18.8239462Z Features: KERNEL_DISPATCH 2025-12-04T12:53:18.8239603Z Fast F16 Operation: TRUE 2025-12-04T12:53:18.8239760Z Wavefront Size: 64(0x40) 2025-12-04T12:53:18.8239914Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:53:18.8240059Z Workgroup Max Size per Dimension: 2025-12-04T12:53:18.8240223Z x 1024(0x400) 2025-12-04T12:53:18.8240351Z y 1024(0x400) 2025-12-04T12:53:18.8240478Z z 1024(0x400) 2025-12-04T12:53:18.8240621Z Max Waves Per CU: 32(0x20) 2025-12-04T12:53:18.8240811Z Max Work-item Per CU: 2048(0x800) 2025-12-04T12:53:18.8240971Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:53:18.8241103Z Grid Max Size per Dimension: 2025-12-04T12:53:18.8241225Z x 4294967295(0xffffffff) 2025-12-04T12:53:18.8241355Z y 4294967295(0xffffffff) 2025-12-04T12:53:18.8241481Z z 4294967295(0xffffffff) 2025-12-04T12:53:18.8241627Z Max fbarriers/Workgrp: 32 2025-12-04T12:53:18.8246975Z Packet Processor uCode:: 185 2025-12-04T12:53:18.8247144Z SDMA engine uCode:: 24 2025-12-04T12:53:18.8247302Z IOMMU Support:: None 2025-12-04T12:53:18.8247439Z Pool Info: 2025-12-04T12:53:18.8247541Z Pool 1 2025-12-04T12:53:18.8247679Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T12:53:18.8247828Z Size: 268419072(0xfffc000) KB 2025-12-04T12:53:18.8247982Z Allocatable: TRUE 2025-12-04T12:53:18.8248147Z Alloc Granule: 4KB 2025-12-04T12:53:18.8248308Z Alloc Recommended Granule:2048KB 2025-12-04T12:53:18.8248475Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8248637Z Accessible by all: FALSE 2025-12-04T12:53:18.8248772Z Pool 2 2025-12-04T12:53:18.8248904Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T12:53:18.8249051Z Size: 268419072(0xfffc000) KB 2025-12-04T12:53:18.8249201Z Allocatable: TRUE 2025-12-04T12:53:18.8249357Z Alloc Granule: 4KB 2025-12-04T12:53:18.8249516Z Alloc Recommended Granule:2048KB 2025-12-04T12:53:18.8249679Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8249841Z Accessible by all: FALSE 2025-12-04T12:53:18.8249974Z Pool 3 2025-12-04T12:53:18.8250103Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T12:53:18.8250311Z Size: 268419072(0xfffc000) KB 2025-12-04T12:53:18.8250459Z Allocatable: TRUE 2025-12-04T12:53:18.8250617Z Alloc Granule: 4KB 2025-12-04T12:53:18.8250776Z Alloc Recommended Granule:2048KB 2025-12-04T12:53:18.8250945Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8251109Z Accessible by all: FALSE 2025-12-04T12:53:18.8251242Z Pool 4 2025-12-04T12:53:18.8251365Z Segment: GROUP 2025-12-04T12:53:18.8251602Z Size: 64(0x40) KB 2025-12-04T12:53:18.8251746Z Allocatable: FALSE 2025-12-04T12:53:18.8251901Z Alloc Granule: 0KB 2025-12-04T12:53:18.8252058Z Alloc Recommended Granule:0KB 2025-12-04T12:53:18.8252222Z Alloc Alignment: 0KB 2025-12-04T12:53:18.8252383Z Accessible by all: FALSE 2025-12-04T12:53:18.8252517Z ISA Info: 2025-12-04T12:53:18.8252625Z ISA 1 2025-12-04T12:53:18.8252814Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T12:53:18.8252978Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:53:18.8253140Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:53:18.8253299Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8253468Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8253626Z Fast f16: TRUE 2025-12-04T12:53:18.8253774Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:53:18.8253919Z Workgroup Max Size per Dimension: 2025-12-04T12:53:18.8254056Z x 1024(0x400) 2025-12-04T12:53:18.8254184Z y 1024(0x400) 2025-12-04T12:53:18.8254312Z z 1024(0x400) 
2025-12-04T12:53:18.8254456Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:53:18.8254592Z Grid Max Size per Dimension: 2025-12-04T12:53:18.8254712Z x 4294967295(0xffffffff) 2025-12-04T12:53:18.8254839Z y 4294967295(0xffffffff) 2025-12-04T12:53:18.8254974Z z 4294967295(0xffffffff) 2025-12-04T12:53:18.8255123Z FBarrier Max Size: 32 2025-12-04T12:53:18.8268818Z ISA 2 2025-12-04T12:53:18.8269016Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T12:53:18.8269195Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:53:18.8269362Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:53:18.8269525Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8269690Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8269850Z Fast f16: TRUE 2025-12-04T12:53:18.8270005Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:53:18.8270153Z Workgroup Max Size per Dimension: 2025-12-04T12:53:18.8270347Z x 1024(0x400) 2025-12-04T12:53:18.8270476Z y 1024(0x400) 2025-12-04T12:53:18.8270603Z z 1024(0x400) 2025-12-04T12:53:18.8270742Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:53:18.8270879Z Grid Max Size per Dimension: 2025-12-04T12:53:18.8271002Z x 4294967295(0xffffffff) 2025-12-04T12:53:18.8271134Z y 4294967295(0xffffffff) 2025-12-04T12:53:18.8271254Z z 4294967295(0xffffffff) 2025-12-04T12:53:18.8271395Z FBarrier Max Size: 32 2025-12-04T12:53:18.8271532Z ******* 2025-12-04T12:53:18.8271629Z Agent 4 2025-12-04T12:53:18.8271728Z ******* 2025-12-04T12:53:18.8271926Z Name: gfx942 2025-12-04T12:53:18.8272071Z Uuid: GPU-0f8baf68cff7012d 2025-12-04T12:53:18.8272230Z Marketing Name: AMD Instinct MI325X 2025-12-04T12:53:18.8272384Z Vendor Name: AMD 2025-12-04T12:53:18.8272532Z Feature: KERNEL_DISPATCH 2025-12-04T12:53:18.8272681Z Profile: BASE_PROFILE 2025-12-04T12:53:18.8272840Z Float Round Mode: NEAR 2025-12-04T12:53:18.8272995Z Max Queue Number: 128(0x80) 2025-12-04T12:53:18.8273180Z Queue Min Size: 64(0x40) 2025-12-04T12:53:18.8273329Z Queue Max Size: 131072(0x20000) 2025-12-04T12:53:18.8273475Z Queue Type: MULTI 2025-12-04T12:53:18.8273616Z Node: 3 2025-12-04T12:53:18.8273760Z Device Type: GPU 2025-12-04T12:53:18.8273892Z Cache Info: 2025-12-04T12:53:18.8274002Z L1: 32(0x20) KB 2025-12-04T12:53:18.8274135Z L2: 4096(0x1000) KB 2025-12-04T12:53:18.8274261Z L3: 262144(0x40000) KB 2025-12-04T12:53:18.8274391Z Chip ID: 29861(0x74a5) 2025-12-04T12:53:18.8274539Z ASIC Revision: 1(0x1) 2025-12-04T12:53:18.8274691Z Cacheline Size: 128(0x80) 2025-12-04T12:53:18.8274842Z Max Clock Freq. (MHz): 2100 2025-12-04T12:53:18.8274984Z BDFID: 1280 2025-12-04T12:53:18.8275128Z Internal Node ID: 3 2025-12-04T12:53:18.8275288Z Compute Unit: 304 2025-12-04T12:53:18.8275432Z SIMDs per CU: 4 2025-12-04T12:53:18.8275580Z Shader Engines: 32 2025-12-04T12:53:18.8275733Z Shader Arrs. per Eng.: 1 2025-12-04T12:53:18.8275886Z WatchPts on Addr. 
Ranges:4 2025-12-04T12:53:18.8276044Z Coherent Host Access: FALSE 2025-12-04T12:53:18.8276187Z Memory Properties: 2025-12-04T12:53:18.8276304Z Features: KERNEL_DISPATCH 2025-12-04T12:53:18.8276457Z Fast F16 Operation: TRUE 2025-12-04T12:53:18.8276611Z Wavefront Size: 64(0x40) 2025-12-04T12:53:18.8276763Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:53:18.8276915Z Workgroup Max Size per Dimension: 2025-12-04T12:53:18.8277036Z x 1024(0x400) 2025-12-04T12:53:18.8277165Z y 1024(0x400) 2025-12-04T12:53:18.8277290Z z 1024(0x400) 2025-12-04T12:53:18.8277432Z Max Waves Per CU: 32(0x20) 2025-12-04T12:53:18.8277591Z Max Work-item Per CU: 2048(0x800) 2025-12-04T12:53:18.8277746Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:53:18.8277890Z Grid Max Size per Dimension: 2025-12-04T12:53:18.8278018Z x 4294967295(0xffffffff) 2025-12-04T12:53:18.8278152Z y 4294967295(0xffffffff) 2025-12-04T12:53:18.8278290Z z 4294967295(0xffffffff) 2025-12-04T12:53:18.8278450Z Max fbarriers/Workgrp: 32 2025-12-04T12:53:18.8278655Z Packet Processor uCode:: 185 2025-12-04T12:53:18.8278827Z SDMA engine uCode:: 24 2025-12-04T12:53:18.8278992Z IOMMU Support:: None 2025-12-04T12:53:18.8279130Z Pool Info: 2025-12-04T12:53:18.8279244Z Pool 1 2025-12-04T12:53:18.8279378Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T12:53:18.8279542Z Size: 268419072(0xfffc000) KB 2025-12-04T12:53:18.8279706Z Allocatable: TRUE 2025-12-04T12:53:18.8279869Z Alloc Granule: 4KB 2025-12-04T12:53:18.8280074Z Alloc Recommended Granule:2048KB 2025-12-04T12:53:18.8280283Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8280446Z Accessible by all: FALSE 2025-12-04T12:53:18.8280598Z Pool 2 2025-12-04T12:53:18.8280733Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T12:53:18.8280893Z Size: 268419072(0xfffc000) KB 2025-12-04T12:53:18.8281052Z Allocatable: TRUE 2025-12-04T12:53:18.8281213Z Alloc Granule: 4KB 2025-12-04T12:53:18.8281385Z Alloc Recommended Granule:2048KB 2025-12-04T12:53:18.8281557Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8281718Z Accessible by all: FALSE 2025-12-04T12:53:18.8281868Z Pool 3 2025-12-04T12:53:18.8282006Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T12:53:18.8282155Z Size: 268419072(0xfffc000) KB 2025-12-04T12:53:18.8282315Z Allocatable: TRUE 2025-12-04T12:53:18.8282474Z Alloc Granule: 4KB 2025-12-04T12:53:18.8282644Z Alloc Recommended Granule:2048KB 2025-12-04T12:53:18.8282818Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8282979Z Accessible by all: FALSE 2025-12-04T12:53:18.8283128Z Pool 4 2025-12-04T12:53:18.8283262Z Segment: GROUP 2025-12-04T12:53:18.8283407Z Size: 64(0x40) KB 2025-12-04T12:53:18.8283566Z Allocatable: FALSE 2025-12-04T12:53:18.8283725Z Alloc Granule: 0KB 2025-12-04T12:53:18.8283900Z Alloc Recommended Granule:0KB 2025-12-04T12:53:18.8284073Z Alloc Alignment: 0KB 2025-12-04T12:53:18.8284232Z Accessible by all: FALSE 2025-12-04T12:53:18.8284379Z ISA Info: 2025-12-04T12:53:18.8284492Z ISA 1 2025-12-04T12:53:18.8284628Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T12:53:18.8284801Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:53:18.8284965Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:53:18.8285136Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8285313Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8285469Z Fast f16: TRUE 2025-12-04T12:53:18.8285630Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:53:18.8285822Z Workgroup Max Size per Dimension: 2025-12-04T12:53:18.8285949Z x 1024(0x400) 2025-12-04T12:53:18.8286086Z y 1024(0x400) 2025-12-04T12:53:18.8286224Z z 1024(0x400) 
2025-12-04T12:53:18.8286369Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:53:18.8286518Z Grid Max Size per Dimension: 2025-12-04T12:53:18.8286641Z x 4294967295(0xffffffff) 2025-12-04T12:53:18.8286782Z y 4294967295(0xffffffff) 2025-12-04T12:53:18.8286950Z z 4294967295(0xffffffff) 2025-12-04T12:53:18.8287098Z FBarrier Max Size: 32 2025-12-04T12:53:18.8287236Z ISA 2 2025-12-04T12:53:18.8287378Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T12:53:18.8287554Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:53:18.8287712Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:53:18.8287864Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8288020Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8288168Z Fast f16: TRUE 2025-12-04T12:53:18.8288319Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:53:18.8288461Z Workgroup Max Size per Dimension: 2025-12-04T12:53:18.8288583Z x 1024(0x400) 2025-12-04T12:53:18.8288708Z y 1024(0x400) 2025-12-04T12:53:18.8288837Z z 1024(0x400) 2025-12-04T12:53:18.8288977Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:53:18.8289118Z Grid Max Size per Dimension: 2025-12-04T12:53:18.8289237Z x 4294967295(0xffffffff) 2025-12-04T12:53:18.8289366Z y 4294967295(0xffffffff) 2025-12-04T12:53:18.8289493Z z 4294967295(0xffffffff) 2025-12-04T12:53:18.8289636Z FBarrier Max Size: 32 2025-12-04T12:53:18.8289766Z ******* 2025-12-04T12:53:18.8289861Z Agent 5 2025-12-04T12:53:18.8289957Z ******* 2025-12-04T12:53:18.8290066Z Name: gfx942 2025-12-04T12:53:18.8290245Z Uuid: GPU-990a2a287a45ff0c 2025-12-04T12:53:18.8290427Z Marketing Name: AMD Instinct MI325X 2025-12-04T12:53:18.8290581Z Vendor Name: AMD 2025-12-04T12:53:18.8290732Z Feature: KERNEL_DISPATCH 2025-12-04T12:53:18.8290880Z Profile: BASE_PROFILE 2025-12-04T12:53:18.8291032Z Float Round Mode: NEAR 2025-12-04T12:53:18.8291179Z Max Queue Number: 128(0x80) 2025-12-04T12:53:18.8291328Z Queue Min Size: 64(0x40) 2025-12-04T12:53:18.8291474Z Queue Max Size: 131072(0x20000) 2025-12-04T12:53:18.8291617Z Queue Type: MULTI 2025-12-04T12:53:18.8291756Z Node: 4 2025-12-04T12:53:18.8291894Z Device Type: GPU 2025-12-04T12:53:18.8292023Z Cache Info: 2025-12-04T12:53:18.8292134Z L1: 32(0x20) KB 2025-12-04T12:53:18.8292262Z L2: 4096(0x1000) KB 2025-12-04T12:53:18.8292425Z L3: 262144(0x40000) KB 2025-12-04T12:53:18.8292555Z Chip ID: 29861(0x74a5) 2025-12-04T12:53:18.8292696Z ASIC Revision: 1(0x1) 2025-12-04T12:53:18.8292848Z Cacheline Size: 128(0x80) 2025-12-04T12:53:18.8292994Z Max Clock Freq. (MHz): 2100 2025-12-04T12:53:18.8293134Z BDFID: 25856 2025-12-04T12:53:18.8293280Z Internal Node ID: 4 2025-12-04T12:53:18.8293468Z Compute Unit: 304 2025-12-04T12:53:18.8293612Z SIMDs per CU: 4 2025-12-04T12:53:18.8293760Z Shader Engines: 32 2025-12-04T12:53:18.8293910Z Shader Arrs. per Eng.: 1 2025-12-04T12:53:18.8294071Z WatchPts on Addr. 
Ranges:4 2025-12-04T12:53:18.8294229Z Coherent Host Access: FALSE 2025-12-04T12:53:18.8294364Z Memory Properties: 2025-12-04T12:53:18.8294483Z Features: KERNEL_DISPATCH 2025-12-04T12:53:18.8294618Z Fast F16 Operation: TRUE 2025-12-04T12:53:18.8294773Z Wavefront Size: 64(0x40) 2025-12-04T12:53:18.8294926Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:53:18.8295062Z Workgroup Max Size per Dimension: 2025-12-04T12:53:18.8295190Z x 1024(0x400) 2025-12-04T12:53:18.8295317Z y 1024(0x400) 2025-12-04T12:53:18.8295436Z z 1024(0x400) 2025-12-04T12:53:18.8295571Z Max Waves Per CU: 32(0x20) 2025-12-04T12:53:18.8295724Z Max Work-item Per CU: 2048(0x800) 2025-12-04T12:53:18.8295881Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:53:18.8296013Z Grid Max Size per Dimension: 2025-12-04T12:53:18.8296121Z x 4294967295(0xffffffff) 2025-12-04T12:53:18.8296251Z y 4294967295(0xffffffff) 2025-12-04T12:53:18.8296375Z z 4294967295(0xffffffff) 2025-12-04T12:53:18.8296515Z Max fbarriers/Workgrp: 32 2025-12-04T12:53:18.8296674Z Packet Processor uCode:: 185 2025-12-04T12:53:18.8296833Z SDMA engine uCode:: 24 2025-12-04T12:53:18.8296985Z IOMMU Support:: None 2025-12-04T12:53:18.8297118Z Pool Info: 2025-12-04T12:53:18.8297227Z Pool 1 2025-12-04T12:53:18.8297354Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T12:53:18.8297499Z Size: 268419072(0xfffc000) KB 2025-12-04T12:53:18.8297645Z Allocatable: TRUE 2025-12-04T12:53:18.8297800Z Alloc Granule: 4KB 2025-12-04T12:53:18.8297957Z Alloc Recommended Granule:2048KB 2025-12-04T12:53:18.8298115Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8298272Z Accessible by all: FALSE 2025-12-04T12:53:18.8298403Z Pool 2 2025-12-04T12:53:18.8298529Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T12:53:18.8298671Z Size: 268419072(0xfffc000) KB 2025-12-04T12:53:18.8298814Z Allocatable: TRUE 2025-12-04T12:53:18.8298996Z Alloc Granule: 4KB 2025-12-04T12:53:18.8299153Z Alloc Recommended Granule:2048KB 2025-12-04T12:53:18.8299311Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8299467Z Accessible by all: FALSE 2025-12-04T12:53:18.8299599Z Pool 3 2025-12-04T12:53:18.8299723Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T12:53:18.8299862Z Size: 268419072(0xfffc000) KB 2025-12-04T12:53:18.8300006Z Allocatable: TRUE 2025-12-04T12:53:18.8300228Z Alloc Granule: 4KB 2025-12-04T12:53:18.8300385Z Alloc Recommended Granule:2048KB 2025-12-04T12:53:18.8300545Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8300702Z Accessible by all: FALSE 2025-12-04T12:53:18.8300832Z Pool 4 2025-12-04T12:53:18.8300950Z Segment: GROUP 2025-12-04T12:53:18.8301088Z Size: 64(0x40) KB 2025-12-04T12:53:18.8301227Z Allocatable: FALSE 2025-12-04T12:53:18.8301378Z Alloc Granule: 0KB 2025-12-04T12:53:18.8301531Z Alloc Recommended Granule:0KB 2025-12-04T12:53:18.8301691Z Alloc Alignment: 0KB 2025-12-04T12:53:18.8301848Z Accessible by all: FALSE 2025-12-04T12:53:18.8301981Z ISA Info: 2025-12-04T12:53:18.8302080Z ISA 1 2025-12-04T12:53:18.8302203Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T12:53:18.8302365Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:53:18.8302524Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:53:18.8302677Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8302838Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8302985Z Fast f16: TRUE 2025-12-04T12:53:18.8303128Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:53:18.8303268Z Workgroup Max Size per Dimension: 2025-12-04T12:53:18.8303398Z x 1024(0x400) 2025-12-04T12:53:18.8303521Z y 1024(0x400) 2025-12-04T12:53:18.8303643Z z 1024(0x400) 
2025-12-04T12:53:18.8303778Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:53:18.8303916Z Grid Max Size per Dimension: 2025-12-04T12:53:18.8304032Z x 4294967295(0xffffffff) 2025-12-04T12:53:18.8304157Z y 4294967295(0xffffffff) 2025-12-04T12:53:18.8304283Z z 4294967295(0xffffffff) 2025-12-04T12:53:18.8304424Z FBarrier Max Size: 32 2025-12-04T12:53:18.8304553Z ISA 2 2025-12-04T12:53:18.8304686Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T12:53:18.8304857Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:53:18.8305018Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:53:18.8305173Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8305329Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8305529Z Fast f16: TRUE 2025-12-04T12:53:18.8305676Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:53:18.8305816Z Workgroup Max Size per Dimension: 2025-12-04T12:53:18.8305941Z x 1024(0x400) 2025-12-04T12:53:18.8306067Z y 1024(0x400) 2025-12-04T12:53:18.8306186Z z 1024(0x400) 2025-12-04T12:53:18.8306322Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:53:18.8306457Z Grid Max Size per Dimension: 2025-12-04T12:53:18.8306608Z x 4294967295(0xffffffff) 2025-12-04T12:53:18.8306736Z y 4294967295(0xffffffff) 2025-12-04T12:53:18.8306941Z z 4294967295(0xffffffff) 2025-12-04T12:53:18.8307176Z FBarrier Max Size: 32 2025-12-04T12:53:18.8307381Z ******* 2025-12-04T12:53:18.8307539Z Agent 6 2025-12-04T12:53:18.8307634Z ******* 2025-12-04T12:53:18.8307741Z Name: gfx942 2025-12-04T12:53:18.8307882Z Uuid: GPU-99c7863ef295feac 2025-12-04T12:53:18.8308033Z Marketing Name: AMD Instinct MI325X 2025-12-04T12:53:18.8308187Z Vendor Name: AMD 2025-12-04T12:53:18.8308332Z Feature: KERNEL_DISPATCH 2025-12-04T12:53:18.8308494Z Profile: BASE_PROFILE 2025-12-04T12:53:18.8308642Z Float Round Mode: NEAR 2025-12-04T12:53:18.8308791Z Max Queue Number: 128(0x80) 2025-12-04T12:53:18.8308941Z Queue Min Size: 64(0x40) 2025-12-04T12:53:18.8309085Z Queue Max Size: 131072(0x20000) 2025-12-04T12:53:18.8309229Z Queue Type: MULTI 2025-12-04T12:53:18.8309370Z Node: 5 2025-12-04T12:53:18.8309506Z Device Type: GPU 2025-12-04T12:53:18.8309633Z Cache Info: 2025-12-04T12:53:18.8309741Z L1: 32(0x20) KB 2025-12-04T12:53:18.8309868Z L2: 4096(0x1000) KB 2025-12-04T12:53:18.8309996Z L3: 262144(0x40000) KB 2025-12-04T12:53:18.8310125Z Chip ID: 29861(0x74a5) 2025-12-04T12:53:18.8310328Z ASIC Revision: 1(0x1) 2025-12-04T12:53:18.8310475Z Cacheline Size: 128(0x80) 2025-12-04T12:53:18.8310623Z Max Clock Freq. (MHz): 2100 2025-12-04T12:53:18.8310764Z BDFID: 5376 2025-12-04T12:53:18.8310910Z Internal Node ID: 5 2025-12-04T12:53:18.8311053Z Compute Unit: 304 2025-12-04T12:53:18.8311196Z SIMDs per CU: 4 2025-12-04T12:53:18.8311342Z Shader Engines: 32 2025-12-04T12:53:18.8311489Z Shader Arrs. per Eng.: 1 2025-12-04T12:53:18.8311643Z WatchPts on Addr. 
Ranges:4 2025-12-04T12:53:18.8311797Z Coherent Host Access: FALSE 2025-12-04T12:53:18.8311933Z Memory Properties: 2025-12-04T12:53:18.8312051Z Features: KERNEL_DISPATCH 2025-12-04T12:53:18.8312228Z Fast F16 Operation: TRUE 2025-12-04T12:53:18.8312381Z Wavefront Size: 64(0x40) 2025-12-04T12:53:18.8312532Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:53:18.8312667Z Workgroup Max Size per Dimension: 2025-12-04T12:53:18.8312785Z x 1024(0x400) 2025-12-04T12:53:18.8312907Z y 1024(0x400) 2025-12-04T12:53:18.8313028Z z 1024(0x400) 2025-12-04T12:53:18.8313164Z Max Waves Per CU: 32(0x20) 2025-12-04T12:53:18.8313345Z Max Work-item Per CU: 2048(0x800) 2025-12-04T12:53:18.8313497Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:53:18.8313628Z Grid Max Size per Dimension: 2025-12-04T12:53:18.8313736Z x 4294967295(0xffffffff) 2025-12-04T12:53:18.8313865Z y 4294967295(0xffffffff) 2025-12-04T12:53:18.8313986Z z 4294967295(0xffffffff) 2025-12-04T12:53:18.8314128Z Max fbarriers/Workgrp: 32 2025-12-04T12:53:18.8314288Z Packet Processor uCode:: 185 2025-12-04T12:53:18.8314445Z SDMA engine uCode:: 24 2025-12-04T12:53:18.8314595Z IOMMU Support:: None 2025-12-04T12:53:18.8314729Z Pool Info: 2025-12-04T12:53:18.8314826Z Pool 1 2025-12-04T12:53:18.8314952Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T12:53:18.8315108Z Size: 268419072(0xfffc000) KB 2025-12-04T12:53:18.8315250Z Allocatable: TRUE 2025-12-04T12:53:18.8315403Z Alloc Granule: 4KB 2025-12-04T12:53:18.8315565Z Alloc Recommended Granule:2048KB 2025-12-04T12:53:18.8315723Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8315877Z Accessible by all: FALSE 2025-12-04T12:53:18.8316005Z Pool 2 2025-12-04T12:53:18.8316130Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T12:53:18.8316275Z Size: 268419072(0xfffc000) KB 2025-12-04T12:53:18.8316414Z Allocatable: TRUE 2025-12-04T12:53:18.8316563Z Alloc Granule: 4KB 2025-12-04T12:53:18.8316720Z Alloc Recommended Granule:2048KB 2025-12-04T12:53:18.8316875Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8317029Z Accessible by all: FALSE 2025-12-04T12:53:18.8317163Z Pool 3 2025-12-04T12:53:18.8317283Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T12:53:18.8317425Z Size: 268419072(0xfffc000) KB 2025-12-04T12:53:18.8317562Z Allocatable: TRUE 2025-12-04T12:53:18.8317711Z Alloc Granule: 4KB 2025-12-04T12:53:18.8317866Z Alloc Recommended Granule:2048KB 2025-12-04T12:53:18.8318020Z Alloc Alignment: 4KB 2025-12-04T12:53:18.8318173Z Accessible by all: FALSE 2025-12-04T12:53:18.8318304Z Pool 4 2025-12-04T12:53:18.8318420Z Segment: GROUP 2025-12-04T12:53:18.8318558Z Size: 64(0x40) KB 2025-12-04T12:53:18.8318736Z Allocatable: FALSE 2025-12-04T12:53:18.8318888Z Alloc Granule: 0KB 2025-12-04T12:53:18.8319043Z Alloc Recommended Granule:0KB 2025-12-04T12:53:18.8319199Z Alloc Alignment: 0KB 2025-12-04T12:53:18.8319351Z Accessible by all: FALSE 2025-12-04T12:53:18.8319485Z ISA Info: 2025-12-04T12:53:18.8319583Z ISA 1 2025-12-04T12:53:18.8319709Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T12:53:18.8319900Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:53:18.8320057Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:53:18.8320249Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8320409Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8320554Z Fast f16: TRUE 2025-12-04T12:53:18.8320701Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:53:18.8320840Z Workgroup Max Size per Dimension: 2025-12-04T12:53:18.8320963Z x 1024(0x400) 2025-12-04T12:53:18.8321086Z y 1024(0x400) 2025-12-04T12:53:18.8321363Z z 1024(0x400) 
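rocminfo is enumerating two EPYC CPU agents and four gfx942 GPU agents (Agents 3 through 6, all reported as AMD Instinct MI325X); the health check that follows counts the whitespace-preceded `gfx` name lines, which matches the four GPU agent names but not the `amdgcn-amd-amdhsa--gfx942` ISA names. A standalone sketch of that check (the wrapper and messages here are illustrative; the grep itself is the workflow's own):

    # Count GPU agents reported by rocminfo; fail fast when none are visible.
    ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx')
    if [[ $ngpu -eq 0 ]]; then
      echo "Error: no GPUs detected on this runner" >&2
      exit 1
    fi
    echo "Detected $ngpu GPU agent(s)"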
2025-12-04T12:53:18.8321501Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:53:18.8321632Z Grid Max Size per Dimension: 2025-12-04T12:53:18.8321755Z x 4294967295(0xffffffff) 2025-12-04T12:53:18.8321887Z y 4294967295(0xffffffff) 2025-12-04T12:53:18.8322010Z z 4294967295(0xffffffff) 2025-12-04T12:53:18.8322156Z FBarrier Max Size: 32 2025-12-04T12:53:18.8322284Z ISA 2 2025-12-04T12:53:18.8322415Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T12:53:18.8322581Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:53:18.8322735Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:53:18.8322889Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8323054Z Default Rounding Mode: NEAR 2025-12-04T12:53:18.8323207Z Fast f16: TRUE 2025-12-04T12:53:18.8323353Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:53:18.8323488Z Workgroup Max Size per Dimension: 2025-12-04T12:53:18.8323615Z x 1024(0x400) 2025-12-04T12:53:18.8323742Z y 1024(0x400) 2025-12-04T12:53:18.8323863Z z 1024(0x400) 2025-12-04T12:53:18.8323997Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:53:18.8324134Z Grid Max Size per Dimension: 2025-12-04T12:53:18.8324246Z x 4294967295(0xffffffff) 2025-12-04T12:53:18.8324372Z y 4294967295(0xffffffff) 2025-12-04T12:53:18.8324496Z z 4294967295(0xffffffff) 2025-12-04T12:53:18.8324638Z FBarrier Max Size: 32 2025-12-04T12:53:18.8324769Z *** Done *** 2025-12-04T12:53:18.8334909Z ##[group]Run ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T12:53:18.8335102Z ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T12:53:18.8335432Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified" 2025-12-04T12:53:18.8335697Z if [[ $ngpu -eq 0 ]]; then 2025-12-04T12:53:18.8335849Z  echo "Error: Failed to detect any GPUs on the runner" 2025-12-04T12:53:18.8335991Z  echo "$msg" 2025-12-04T12:53:18.8336100Z  exit 1 2025-12-04T12:53:18.8336197Z fi 2025-12-04T12:53:18.8339694Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:18.8339840Z env: 2025-12-04T12:53:18.8339928Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.8340034Z ##[endgroup] 2025-12-04T12:53:18.9342260Z ##[group]Run pytorch/pytorch/.github/actions/diskspace-cleanup@main 2025-12-04T12:53:18.9342460Z with: 2025-12-04T12:53:18.9342570Z diskspace-cutoff: 70 2025-12-04T12:53:18.9342682Z env: 2025-12-04T12:53:18.9342786Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.9342900Z ##[endgroup] 2025-12-04T12:53:18.9365481Z ##[group]Run set -ex 2025-12-04T12:53:18.9365630Z set -ex 2025-12-04T12:53:18.9365732Z diskspace_cutoff=70 2025-12-04T12:53:18.9365878Z docker_root_dir=$(docker info -f '{{.DockerRootDir}}') 2025-12-04T12:53:18.9366045Z if [ ! -d "$docker_root_dir" ]; then 2025-12-04T12:53:18.9366247Z  echo "Docker root directory ($docker_root_dir) does not exist. Skipping disk space check." 2025-12-04T12:53:18.9366429Z  exit 0 2025-12-04T12:53:18.9366526Z fi 2025-12-04T12:53:18.9366692Z diskspace=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-12-04T12:53:18.9367029Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified" 2025-12-04T12:53:18.9367311Z if [[ "$diskspace" -ge "$diskspace_cutoff" ]] ; then 2025-12-04T12:53:18.9367456Z  docker system prune -af 2025-12-04T12:53:18.9367654Z  diskspace_new=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-12-04T12:53:18.9367873Z  if [[ "$diskspace_new" -gt "$diskspace_cutoff" ]] ; then 2025-12-04T12:53:18.9368042Z  diskspace_cutoff_int=$((diskspace_cutoff + 0)) 2025-12-04T12:53:18.9368196Z  difference=$((100 - diskspace_cutoff_int)) 2025-12-04T12:53:18.9368411Z  echo "Error: Available diskspace is less than $difference percent. Not enough diskspace." 2025-12-04T12:53:18.9368606Z  echo "$msg" 2025-12-04T12:53:18.9368711Z  exit 1 2025-12-04T12:53:18.9368810Z  else 2025-12-04T12:53:18.9368925Z  difference=$((diskspace - diskspace_new)) 2025-12-04T12:53:18.9369084Z  echo "Diskspace saved: $difference percent" 2025-12-04T12:53:18.9369218Z  fi 2025-12-04T12:53:18.9369303Z fi 2025-12-04T12:53:18.9374107Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:18.9374254Z env: 2025-12-04T12:53:18.9374342Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.9374449Z ##[endgroup] 2025-12-04T12:53:18.9390271Z + diskspace_cutoff=70 2025-12-04T12:53:18.9393804Z ++ docker info -f '{{.DockerRootDir}}' 2025-12-04T12:53:18.9689832Z + docker_root_dir=/home/runner/docker-data 2025-12-04T12:53:18.9690229Z + '[' '!' -d /home/runner/docker-data ']' 2025-12-04T12:53:18.9697242Z ++ df -H --output=pcent /home/runner/docker-data 2025-12-04T12:53:18.9697744Z ++ sed -n 2p 2025-12-04T12:53:18.9697969Z ++ sed s/%// 2025-12-04T12:53:18.9698267Z ++ sed 's/ //' 2025-12-04T12:53:18.9713949Z + diskspace=' 3' 2025-12-04T12:53:18.9714507Z + msg='Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified' 2025-12-04T12:53:18.9714989Z + [[ 3 -ge 70 ]] 2025-12-04T12:53:18.9735868Z ##[group]Run RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-12-04T12:53:18.9736245Z RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-12-04T12:53:18.9736402Z rm -rf "${RUNNER_ARTIFACT_DIR}" 2025-12-04T12:53:18.9736544Z mkdir -p "${RUNNER_ARTIFACT_DIR}" 2025-12-04T12:53:18.9736720Z echo "RUNNER_ARTIFACT_DIR=${RUNNER_ARTIFACT_DIR}" >> "${GITHUB_ENV}" 2025-12-04T12:53:18.9736883Z  2025-12-04T12:53:18.9737006Z RUNNER_TEST_RESULTS_DIR="${RUNNER_TEMP}/test-results" 2025-12-04T12:53:18.9737166Z rm -rf "${RUNNER_TEST_RESULTS_DIR}" 2025-12-04T12:53:18.9737298Z mkdir -p "${RUNNER_TEST_RESULTS_DIR}" 2025-12-04T12:53:18.9737481Z echo "RUNNER_TEST_RESULTS_DIR=${RUNNER_TEST_RESULTS_DIR}" >> "${GITHUB_ENV}" 2025-12-04T12:53:18.9737649Z  2025-12-04T12:53:18.9737908Z RUNNER_DOCS_DIR="${RUNNER_TEMP}/docs" 2025-12-04T12:53:18.9738040Z rm -rf "${RUNNER_DOCS_DIR}" 2025-12-04T12:53:18.9738164Z mkdir -p "${RUNNER_DOCS_DIR}" 2025-12-04T12:53:18.9738330Z echo "RUNNER_DOCS_DIR=${RUNNER_DOCS_DIR}" >> "${GITHUB_ENV}" 2025-12-04T12:53:18.9742898Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:18.9743044Z env: 2025-12-04T12:53:18.9743139Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.9743243Z ##[endgroup] 2025-12-04T12:53:18.9822448Z ##[group]Run env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T12:53:18.9822837Z env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T12:53:18.9823096Z env | grep '^CI' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T12:53:18.9826698Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:18.9826880Z env: 2025-12-04T12:53:18.9827014Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.9827187Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:18.9827405Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:18.9827620Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:18.9827800Z ##[endgroup] 2025-12-04T12:53:18.9866425Z ##[group]Run # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-12-04T12:53:18.9866702Z # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-12-04T12:53:18.9867200Z # Add render group for container creation. 2025-12-04T12:53:18.9867365Z render_gid=`cat /etc/group | grep render | cut -d: -f3` 2025-12-04T12:53:18.9867574Z # Ensure GPU isolation if pod is part of kubernetes setup with DEVICE_FLAG. 2025-12-04T12:53:18.9867773Z if [ -f "/etc/podinfo/gha-render-devices" ]; then 2025-12-04T12:53:18.9867943Z  DEVICE_FLAG=$(cat /etc/podinfo/gha-render-devices) 2025-12-04T12:53:18.9868083Z else 2025-12-04T12:53:18.9868188Z  DEVICE_FLAG="--device /dev/dri" 2025-12-04T12:53:18.9868301Z fi 2025-12-04T12:53:18.9868483Z # The --group-add daemon and --group-add bin are needed in the Ubuntu 24.04 and Almalinux OSs respectively. 2025-12-04T12:53:18.9868765Z # This is due to the device files (/dev/kfd & /dev/dri) being owned by video group on bare metal. 2025-12-04T12:53:18.9869018Z # This video group ID maps to subgid 1 inside the docker image due to the /etc/subgid entries. 2025-12-04T12:53:18.9869285Z # The group name corresponding to group ID 1 can change depending on the OS, so both are necessary. 
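The disk-space guard that just ran reduces to: read the percent-used figure for Docker's root directory, prune everything if it crosses the cutoff, and fail the job only if pruning does not get back under the cutoff (here usage was 3%, so the whole branch was skipped). A minimal standalone sketch of the same pattern, slightly condensed from the step above; the cutoff of 70 is this job's diskspace-cutoff input:

    #!/usr/bin/env bash
    # Sketch of the diskspace-cleanup guard shown in the log above.
    set -euo pipefail
    cutoff=70                                     # percent-used threshold (action input)
    root=$(docker info -f '{{.DockerRootDir}}')   # where Docker keeps images/layers
    pct() { df -H --output=pcent "$1" | sed -n 2p | tr -dc '0-9'; }
    if (( $(pct "$root") >= cutoff )); then
      docker system prune -af                     # drop all unused images/containers
      (( $(pct "$root") <= cutoff )) || { echo "still above ${cutoff}% after prune"; exit 1; }
    fi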
2025-12-04T12:53:18.9869736Z echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd $DEVICE_FLAG --group-add video --group-add $render_gid --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host" >> "${GITHUB_ENV}" 2025-12-04T12:53:18.9872524Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:18.9872698Z env: 2025-12-04T12:53:18.9889171Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.9889312Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:18.9889621Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:18.9889783Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:18.9889914Z ##[endgroup] 2025-12-04T12:53:18.9956074Z ##[group]Run aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722 2025-12-04T12:53:18.9956278Z with: 2025-12-04T12:53:18.9956424Z role-to-assume: arn:aws:iam::308535385114:role/gha_workflow_s3_and_ecr_read_only 2025-12-04T12:53:18.9956597Z aws-region: us-east-1 2025-12-04T12:53:18.9956711Z role-duration-seconds: 18000 2025-12-04T12:53:18.9956829Z audience: sts.amazonaws.com 2025-12-04T12:53:18.9956939Z env: 2025-12-04T12:53:18.9957022Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:18.9957266Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:18.9957437Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:18.9957599Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:18.9958121Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:18.9958606Z ##[endgroup] 2025-12-04T12:53:19.2883331Z Assuming role with OIDC 2025-12-04T12:53:19.6371636Z Authenticated as assumedRoleId AROAUPVRELQNLLCOPFEJR:GitHubActions 2025-12-04T12:53:19.7345277Z ##[group]Run aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076 2025-12-04T12:53:19.7345482Z with: 2025-12-04T12:53:19.7345588Z mask-password: true 2025-12-04T12:53:19.7345718Z registry-type: private 2025-12-04T12:53:19.7345831Z skip-logout: false 2025-12-04T12:53:19.7345934Z env: 2025-12-04T12:53:19.7346030Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:19.7346165Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:19.7346357Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:19.7346524Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:19.7347037Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:19.7347537Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:53:19.7347657Z AWS_REGION: us-east-1 2025-12-04T12:53:19.7348079Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:53:19.7348270Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:53:19.7350634Z AWS_SESSION_TOKEN: *** 2025-12-04T12:53:19.7350744Z ##[endgroup] 2025-12-04T12:53:20.1516501Z Logging into registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:20.7658839Z ##[group]Run env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 
2025-12-04T12:53:20.7659216Z env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T12:53:20.7659505Z env | grep '^CI' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T12:53:20.7659785Z env | grep '^RUNNER' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T12:53:20.7665142Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:20.7665315Z env: 2025-12-04T12:53:20.7665426Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:20.7665588Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:20.7665796Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:20.7665990Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:20.7666582Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:20.7667295Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:53:20.7667432Z AWS_REGION: us-east-1 2025-12-04T12:53:20.7667676Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:53:20.7667861Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:53:20.7670586Z AWS_SESSION_TOKEN: *** 2025-12-04T12:53:20.7670712Z ##[endgroup] 2025-12-04T12:53:20.7770295Z ##[group]Run ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T12:53:20.7770525Z ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T12:53:20.7770785Z if [[ $ngpu -lt 2 ]]; then # We are temporarily reducing this from 4 to 2 so that we can run tests on nodes with fewer GPUs. 2025-12-04T12:53:20.7771072Z  echo "Error: only $ngpu GPU(s) detected, at least 2 GPUs are needed for distributed jobs" 2025-12-04T12:53:20.7771254Z  exit 1 2025-12-04T12:53:20.7771344Z fi 2025-12-04T12:53:20.7775812Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:20.7775958Z env: 2025-12-04T12:53:20.7776052Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:20.7776184Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:20.7776358Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:20.7776523Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:20.7777036Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:20.7777521Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:53:20.7777638Z AWS_REGION: us-east-1 2025-12-04T12:53:20.7777843Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:53:20.7778003Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:53:20.7780161Z AWS_SESSION_TOKEN: *** 2025-12-04T12:53:20.7780446Z ##[endgroup] 2025-12-04T12:53:20.8873402Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2025-12-04T12:53:20.8873597Z with: 2025-12-04T12:53:20.8873876Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:53:20.8874182Z use-custom-docker-registry: true 2025-12-04T12:53:20.8874313Z docker-build-dir: .ci/docker 2025-12-04T12:53:20.8874440Z docker-build-script: ./build.sh 2025-12-04T12:53:20.8874564Z working-directory: .
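The GPU-count gate in that step works because rocminfo prints one "Name: gfx…" line per GPU agent (CPU agents report a CPU model string instead), so grep -c over those lines equals the number of visible GPUs; distributed shards insist on at least 2. A short sketch of the same probe, with an added arch lookup for logging that is illustrative only and not part of the workflow:

    # Count GPU agents and report the architecture; gate distributed runs on >= 2.
    ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx')
    arch=$(rocminfo | grep -o -m1 'gfx[0-9a-z]*')   # e.g. gfx942 on this runner
    echo "detected ${ngpu} x ${arch}"
    [[ $ngpu -ge 2 ]] || { echo "need at least 2 GPUs for distributed jobs"; exit 1; }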
2025-12-04T12:53:20.8874710Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:20.8874869Z force-push: false 2025-12-04T12:53:20.8874966Z env: 2025-12-04T12:53:20.8875062Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:20.8875203Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:20.8875382Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:20.8875580Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:20.8876086Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:20.8876584Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:53:20.8876705Z AWS_REGION: us-east-1 2025-12-04T12:53:20.8876869Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:53:20.8877024Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:53:20.8879177Z AWS_SESSION_TOKEN: *** 2025-12-04T12:53:20.8879286Z ##[endgroup] 2025-12-04T12:53:20.8887634Z ##[group]Run set -ex 2025-12-04T12:53:20.8887764Z set -ex 2025-12-04T12:53:20.8887861Z  2025-12-04T12:53:20.8888020Z # If the docker build directory or the build script doesn't exist, the action will 2025-12-04T12:53:20.8888349Z # gracefully return the docker image name as it is. Pulling docker image in Linux 2025-12-04T12:53:20.8888563Z # job could then download the pre-built image as usual 2025-12-04T12:53:20.8888822Z if [[ -d "${DOCKER_BUILD_DIR}" ]] && [[ -f "${DOCKER_BUILD_DIR}/${DOCKER_BUILD_SCRIPT}" ]] && [[ "${USE_CUSTOM_DOCKER_REGISTRY}" == "true" ]]; then 2025-12-04T12:53:20.8889068Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2025-12-04T12:53:20.8889203Z else 2025-12-04T12:53:20.8889317Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2025-12-04T12:53:20.8889490Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-12-04T12:53:20.8889647Z  2025-12-04T12:53:20.8889856Z  echo "Not using custom ECR registry. Either it was not requested or there is no Docker build script in the ${REPO_NAME} repo..." 
2025-12-04T12:53:20.8890086Z  exit 0 2025-12-04T12:53:20.8890245Z fi 2025-12-04T12:53:20.8890343Z  2025-12-04T12:53:20.8890487Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2025-12-04T12:53:20.8890716Z  # The docker image name already includes the ECR prefix and tag, so we can just 2025-12-04T12:53:20.8890921Z  # use it as it is, but first let's extract the tag 2025-12-04T12:53:20.8891108Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2025-12-04T12:53:20.8891303Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T12:53:20.8891488Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-12-04T12:53:20.8891646Z else 2025-12-04T12:53:20.8891762Z  if [[ "${DOCKER_IMAGE_NAME}" == *:* ]]; then 2025-12-04T12:53:20.8891916Z  CUSTOM_TAG_PREFIX=${DOCKER_IMAGE_NAME#*:} 2025-12-04T12:53:20.8892070Z  DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME%%:*} 2025-12-04T12:53:20.8892206Z  fi 2025-12-04T12:53:20.8892456Z  DOCKER_TAG=${CUSTOM_TAG_PREFIX:+${CUSTOM_TAG_PREFIX}-}$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2025-12-04T12:53:20.8892685Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T12:53:20.8892925Z  echo "docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T12:53:20.8893183Z  echo "custom-tag-prefix=${CUSTOM_TAG_PREFIX}" >> "${GITHUB_OUTPUT}" 2025-12-04T12:53:20.8893346Z fi 2025-12-04T12:53:20.8896174Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:20.8896319Z env: 2025-12-04T12:53:20.8896418Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:20.8896557Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:20.8896735Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:20.8896904Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:20.8897410Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:20.8897904Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:53:20.8898027Z AWS_REGION: us-east-1 2025-12-04T12:53:20.8898166Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:53:20.8898321Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:53:20.8900492Z AWS_SESSION_TOKEN: *** 2025-12-04T12:53:20.8900602Z REPO_NAME: pytorch 2025-12-04T12:53:20.8900877Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:53:20.8901173Z DOCKER_BUILD_DIR: .ci/docker 2025-12-04T12:53:20.8901297Z DOCKER_BUILD_SCRIPT: ./build.sh 2025-12-04T12:53:20.8901451Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:20.8901667Z USE_CUSTOM_DOCKER_REGISTRY: true 2025-12-04T12:53:20.8901792Z CUSTOM_TAG_PREFIX: 2025-12-04T12:53:20.8901900Z ##[endgroup] 2025-12-04T12:53:20.8922738Z + [[ -d .ci/docker ]] 2025-12-04T12:53:20.8922889Z + [[ -f .ci/docker/./build.sh ]] 2025-12-04T12:53:20.8923021Z + [[ true == \t\r\u\e ]] 2025-12-04T12:53:20.8923132Z + echo skip=false 2025-12-04T12:53:20.8923682Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a == 
*\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2025-12-04T12:53:20.8930588Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:53:20.8932613Z ++ awk -F '[:,]' '{print $2}' 2025-12-04T12:53:20.8945046Z + DOCKER_TAG=pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:53:20.8945517Z + echo docker-tag=pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:53:20.8946073Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:53:20.8974802Z ##[group]Run set +e 2025-12-04T12:53:20.8974929Z set +e 2025-12-04T12:53:20.8975021Z set -x 2025-12-04T12:53:20.8975111Z  2025-12-04T12:53:20.8975198Z login() { 2025-12-04T12:53:20.8975384Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-12-04T12:53:20.8975576Z } 2025-12-04T12:53:20.8975662Z  2025-12-04T12:53:20.8975749Z retry () { 2025-12-04T12:53:20.8975863Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-12-04T12:53:20.8975984Z } 2025-12-04T12:53:20.8976069Z  2025-12-04T12:53:20.8976160Z retry login "${DOCKER_REGISTRY}" 2025-12-04T12:53:20.8976287Z  2025-12-04T12:53:20.8976446Z START_TIME=$(date +%s) 2025-12-04T12:53:20.8976567Z # Wait up to 120 minutes 2025-12-04T12:53:20.8976715Z while [[ $(( $(date +%s) - 7200 )) -lt $START_TIME ]]; do 2025-12-04T12:53:20.8976898Z  # Check if image already exists, if it does then skip building it 2025-12-04T12:53:20.8977082Z  if docker manifest inspect "${DOCKER_IMAGE}"; then 2025-12-04T12:53:20.8977219Z  exit 0 2025-12-04T12:53:20.8977313Z  fi 2025-12-04T12:53:20.8977399Z  2025-12-04T12:53:20.8977544Z  # NB: This flag is used by Docker build workflow to push the image to ECR, so we can 2025-12-04T12:53:20.8977790Z  # use this to differentiate between the Docker build and regular build jobs. For the 2025-12-04T12:53:20.8978025Z  # latter, it will wait for the Docker images to become available before continuing 2025-12-04T12:53:20.8978219Z  if [ "${DOCKER_PUSH:-false}" == "true" ]; then 2025-12-04T12:53:20.8978383Z  # It's a Docker build job, let's build the image 2025-12-04T12:53:20.8978513Z  break 2025-12-04T12:53:20.8978607Z  else 2025-12-04T12:53:20.8978739Z  # It's a regular build job, wait for the image to become available 2025-12-04T12:53:20.8978889Z  sleep 300 2025-12-04T12:53:20.8978987Z  fi 2025-12-04T12:53:20.8979075Z done 2025-12-04T12:53:20.8979160Z  2025-12-04T12:53:20.8979294Z # NB: This part requires a full checkout. Otherwise, the merge base will 2025-12-04T12:53:20.8979502Z # be empty. 
The default action would be to continue and rebuild the image 2025-12-04T12:53:20.8979690Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2025-12-04T12:53:20.8979859Z  # if we're on the base branch then use the parent commit 2025-12-04T12:53:20.8980012Z  MERGE_BASE=$(git rev-parse HEAD~) 2025-12-04T12:53:20.8980236Z else 2025-12-04T12:53:20.8980365Z  # otherwise we're on a PR, so use the most recent base commit 2025-12-04T12:53:20.8980545Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2025-12-04T12:53:20.8980680Z fi 2025-12-04T12:53:20.8980764Z  2025-12-04T12:53:20.8980859Z if [[ -z "${MERGE_BASE}" ]]; then 2025-12-04T12:53:20.8980997Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-12-04T12:53:20.8981120Z  2025-12-04T12:53:20.8981290Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2025-12-04T12:53:20.8981483Z  exit 0 2025-12-04T12:53:20.8981573Z fi 2025-12-04T12:53:20.8981657Z  2025-12-04T12:53:20.8981776Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2025-12-04T12:53:20.8982022Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2025-12-04T12:53:20.8982241Z  exit 1 2025-12-04T12:53:20.8982329Z fi 2025-12-04T12:53:20.8982413Z  2025-12-04T12:53:20.8982553Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2025-12-04T12:53:20.8982790Z # If no image exists but the hash is the same as the previous hash then we should error out here 2025-12-04T12:53:20.8983004Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2025-12-04T12:53:20.8983244Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2025-12-04T12:53:20.8983516Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2025-12-04T12:53:20.8983681Z fi 2025-12-04T12:53:20.8983763Z  2025-12-04T12:53:20.8983867Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-12-04T12:53:20.8986406Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:20.8986598Z env: 2025-12-04T12:53:20.8986688Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:20.8986822Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:20.8986994Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:20.8987157Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:20.8987657Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:20.8988141Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:53:20.8988256Z AWS_REGION: us-east-1 2025-12-04T12:53:20.8988394Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:53:20.8988545Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:53:20.8990765Z AWS_SESSION_TOKEN: *** 2025-12-04T12:53:20.8990880Z DOCKER_BUILD_DIR: .ci/docker 2025-12-04T12:53:20.8991017Z BASE_REVISION: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:53:20.8991324Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:53:20.8991672Z DOCKER_TAG: pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:53:20.8991895Z DOCKER_REGISTRY:
308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:20.8992042Z DOCKER_PUSH: 2025-12-04T12:53:20.8992134Z ##[endgroup] 2025-12-04T12:53:20.9009118Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:20.9009370Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:20.9011558Z + aws ecr get-login-password --region us-east-1 2025-12-04T12:53:20.9012082Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:20.9012513Z /home/runner/_work/_temp/089ad673-ec77-47cb-a3ab-1dbf25d4b6be.sh: line 5: aws: command not found 2025-12-04T12:53:20.9082190Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T12:53:20.9092190Z + sleep 1 2025-12-04T12:53:21.9102545Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:21.9106841Z + aws ecr get-login-password --region us-east-1 2025-12-04T12:53:21.9107510Z /home/runner/_work/_temp/089ad673-ec77-47cb-a3ab-1dbf25d4b6be.sh: line 5: aws: command not found 2025-12-04T12:53:21.9108266Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:21.9193678Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T12:53:21.9208633Z + sleep 2 2025-12-04T12:53:23.9224341Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:23.9227782Z + aws ecr get-login-password --region us-east-1 2025-12-04T12:53:23.9228325Z /home/runner/_work/_temp/089ad673-ec77-47cb-a3ab-1dbf25d4b6be.sh: line 5: aws: command not found 2025-12-04T12:53:23.9229612Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:23.9332249Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T12:53:23.9348161Z ++ date +%s 2025-12-04T12:53:23.9358519Z + START_TIME=1764852803 2025-12-04T12:53:23.9362939Z ++ date +%s 2025-12-04T12:53:23.9371869Z + [[ 1764845603 -lt 1764852803 ]] 2025-12-04T12:53:23.9372469Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:53:25.2601744Z { 2025-12-04T12:53:25.2601975Z "schemaVersion": 2, 2025-12-04T12:53:25.2602234Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2025-12-04T12:53:25.2602431Z "config": { 2025-12-04T12:53:25.2602573Z "mediaType": "application/vnd.docker.container.image.v1+json", 2025-12-04T12:53:25.2602733Z "size": 30520, 2025-12-04T12:53:25.2602913Z "digest": "sha256:45252333063339f104d56e41f20304e9511ab21c7768e8d156b95ddf24a9dbe5" 2025-12-04T12:53:25.2603682Z }, 2025-12-04T12:53:25.2603769Z "layers": [ 2025-12-04T12:53:25.2603864Z { 2025-12-04T12:53:25.2604131Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2604296Z "size": 30447951, 2025-12-04T12:53:25.2604464Z "digest": "sha256:63e5bc7682b85ae57a1221210f64d62e7a90b0a30f19af4ca734b8242ae49d63" 2025-12-04T12:53:25.2604640Z }, 2025-12-04T12:53:25.2604724Z { 2025-12-04T12:53:25.2604857Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2605016Z "size": 1554, 2025-12-04T12:53:25.2605177Z "digest": "sha256:835841cca3b7e1464290cdb78e48773e03583413fbed852c3cc5165a392ea44d" 2025-12-04T12:53:25.2605355Z }, 2025-12-04T12:53:25.2605440Z { 2025-12-04T12:53:25.2605573Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2605737Z "size": 313275691, 2025-12-04T12:53:25.2605911Z "digest": 
"sha256:aac69780afc8611a5f94a235792d39ae055249c8319ef43b78675998a9b2f825" 2025-12-04T12:53:25.2606088Z }, 2025-12-04T12:53:25.2606173Z { 2025-12-04T12:53:25.2606304Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2606464Z "size": 704, 2025-12-04T12:53:25.2606628Z "digest": "sha256:029495b23122c840ca0e52d487afa8d2c4dbf1991cd7f204ec3e434dcf947bf4" 2025-12-04T12:53:25.2606806Z }, 2025-12-04T12:53:25.2606890Z { 2025-12-04T12:53:25.2607016Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2607180Z "size": 1218, 2025-12-04T12:53:25.2607342Z "digest": "sha256:d0fb85b008332051a3f7c052721ef68bde404b46c23fa43ad040373bd367826c" 2025-12-04T12:53:25.2607518Z }, 2025-12-04T12:53:25.2607600Z { 2025-12-04T12:53:25.2607729Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2607889Z "size": 484, 2025-12-04T12:53:25.2608050Z "digest": "sha256:59b63930883363c7d2aaab27cc61555d9f3e119dc18247a8624c98ebdaa354a5" 2025-12-04T12:53:25.2608359Z }, 2025-12-04T12:53:25.2608447Z { 2025-12-04T12:53:25.2608577Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2608738Z "size": 110363202, 2025-12-04T12:53:25.2608906Z "digest": "sha256:dc112c89d57aa1e85082e40a56e5bc743d64f834ae2f98afe91f60c248354d38" 2025-12-04T12:53:25.2609084Z }, 2025-12-04T12:53:25.2609167Z { 2025-12-04T12:53:25.2609300Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2609461Z "size": 4436, 2025-12-04T12:53:25.2609621Z "digest": "sha256:522eab2402e5001810155ef7eb56940b7c01a4fef62ac588886981c3b8ee8e1e" 2025-12-04T12:53:25.2609795Z }, 2025-12-04T12:53:25.2609879Z { 2025-12-04T12:53:25.2610011Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2610254Z "size": 1755, 2025-12-04T12:53:25.2610409Z "digest": "sha256:2b5a11b41761d8ea3b829e4772e4064cb6c4e4989126af324d0057661e4493a1" 2025-12-04T12:53:25.2610587Z }, 2025-12-04T12:53:25.2610668Z { 2025-12-04T12:53:25.2610809Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2610965Z "size": 724, 2025-12-04T12:53:25.2611125Z "digest": "sha256:9681563a88ff9e62494a2740e537440d3df978d466c9478d6a941fae8b57b084" 2025-12-04T12:53:25.2611299Z }, 2025-12-04T12:53:25.2611381Z { 2025-12-04T12:53:25.2611516Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2611676Z "size": 3185588166, 2025-12-04T12:53:25.2611843Z "digest": "sha256:73e33534e9eb94cf29418d65944168962b65fe21f55e9b8bad18c76e9b3a37b8" 2025-12-04T12:53:25.2612017Z }, 2025-12-04T12:53:25.2612100Z { 2025-12-04T12:53:25.2612228Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2612386Z "size": 396, 2025-12-04T12:53:25.2612552Z "digest": "sha256:5bfdaeb5578d6ffcd7db29c48303cbceb13c591210feaa216a8daa7a6d445b4b" 2025-12-04T12:53:25.2612732Z }, 2025-12-04T12:53:25.2612821Z { 2025-12-04T12:53:25.2613006Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2613167Z "size": 236863, 2025-12-04T12:53:25.2613334Z "digest": "sha256:c07d27e4d3a5ba4ad5325bb785b2e4f058fe5e10ec1aeeb413a1e152b073f203" 2025-12-04T12:53:25.2613520Z }, 2025-12-04T12:53:25.2613598Z { 2025-12-04T12:53:25.2613728Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2613887Z "size": 787, 2025-12-04T12:53:25.2614050Z "digest": "sha256:b21856d1bf420da6fa8ec7331b82ab355d4f4178644e7d3a3d3d0fbc3610109a" 
2025-12-04T12:53:25.2614231Z }, 2025-12-04T12:53:25.2614313Z { 2025-12-04T12:53:25.2614442Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2614600Z "size": 106, 2025-12-04T12:53:25.2614765Z "digest": "sha256:cb19d84867e4063f55db9459c28c50a2abc37c06d3c1ca82ba95fa8427cc438a" 2025-12-04T12:53:25.2614943Z }, 2025-12-04T12:53:25.2615028Z { 2025-12-04T12:53:25.2615164Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2615388Z "size": 1496, 2025-12-04T12:53:25.2615903Z "digest": "sha256:8165374f8dccf88a7791a5d31afbe29e4d4542b4f1cf1904945e07f9af6bf8ba" 2025-12-04T12:53:25.2616086Z }, 2025-12-04T12:53:25.2616171Z { 2025-12-04T12:53:25.2616305Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2616468Z "size": 458789560, 2025-12-04T12:53:25.2616650Z "digest": "sha256:1aecc77354ceba59ec6f0d37a558f2dbb6d5c0854553ee8505ac8707b422da6d" 2025-12-04T12:53:25.2616829Z }, 2025-12-04T12:53:25.2616912Z { 2025-12-04T12:53:25.2617042Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2617235Z "size": 164, 2025-12-04T12:53:25.2617400Z "digest": "sha256:465d3fd643aa2ea0ad07335cda66f12f1d7e5e800c4e9385ec466bc8a1ceabda" 2025-12-04T12:53:25.2617580Z }, 2025-12-04T12:53:25.2617662Z { 2025-12-04T12:53:25.2617791Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2618008Z "size": 104, 2025-12-04T12:53:25.2618169Z "digest": "sha256:6c503e779d6f41ca7f51309875df2b725c171926aece7009c4b8a64d1ba3f58e" 2025-12-04T12:53:25.2618346Z }, 2025-12-04T12:53:25.2618425Z { 2025-12-04T12:53:25.2618555Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2618712Z "size": 724, 2025-12-04T12:53:25.2618869Z "digest": "sha256:9681563a88ff9e62494a2740e537440d3df978d466c9478d6a941fae8b57b084" 2025-12-04T12:53:25.2619043Z }, 2025-12-04T12:53:25.2619124Z { 2025-12-04T12:53:25.2619254Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2619411Z "size": 196, 2025-12-04T12:53:25.2619571Z "digest": "sha256:f7e9a021f0ee3d11a50dcb96378af8103a21f6c3c142f54529207648f3ed00b2" 2025-12-04T12:53:25.2619750Z }, 2025-12-04T12:53:25.2619834Z { 2025-12-04T12:53:25.2619961Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2620124Z "size": 2583, 2025-12-04T12:53:25.2620328Z "digest": "sha256:8e023b349080fb11ee55491bc9b842b30e9e3a90246d05b303a73dc62038caf2" 2025-12-04T12:53:25.2620507Z }, 2025-12-04T12:53:25.2620589Z { 2025-12-04T12:53:25.2620720Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2620881Z "size": 7577171420, 2025-12-04T12:53:25.2621047Z "digest": "sha256:8188df80e595a3dbcf84623c6a58a655269898cbb60029435f136d7f9d34ccaa" 2025-12-04T12:53:25.2621225Z }, 2025-12-04T12:53:25.2621309Z { 2025-12-04T12:53:25.2621433Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2621590Z "size": 135, 2025-12-04T12:53:25.2621753Z "digest": "sha256:3c2c2f8c74bfa16c4bf9a832c97bbb1d55205b2b4a2cead02cf74301ca1001fb" 2025-12-04T12:53:25.2621935Z }, 2025-12-04T12:53:25.2622018Z { 2025-12-04T12:53:25.2622147Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2622304Z "size": 104, 2025-12-04T12:53:25.2622523Z "digest": "sha256:2aa7784fbe3300f8bbfb6bb51cff3b01fd091e829c2bc7ab9e25261a0dd9b3bd" 2025-12-04T12:53:25.2622704Z }, 2025-12-04T12:53:25.2622786Z { 2025-12-04T12:53:25.2622915Z 
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2623071Z "size": 612, 2025-12-04T12:53:25.2623229Z "digest": "sha256:2b3b5215d3ebe8789f0444457bfd5a6e218289b64aa07653ac3d03ddda5e6708" 2025-12-04T12:53:25.2623407Z }, 2025-12-04T12:53:25.2623488Z { 2025-12-04T12:53:25.2623615Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2623771Z "size": 838191945, 2025-12-04T12:53:25.2623940Z "digest": "sha256:99b1f1ea3e857834cebd01763d90fbd700aeb9c2d2ef23eda2cfff5652c9708b" 2025-12-04T12:53:25.2624119Z }, 2025-12-04T12:53:25.2624225Z { 2025-12-04T12:53:25.2624356Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2624511Z "size": 111, 2025-12-04T12:53:25.2624681Z "digest": "sha256:18d6daba0a5768a37ad106b57974f6b7efd35c43a87c246bcd3f43fea88f2d2b" 2025-12-04T12:53:25.2624863Z }, 2025-12-04T12:53:25.2624946Z { 2025-12-04T12:53:25.2625074Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2625231Z "size": 1555, 2025-12-04T12:53:25.2625392Z "digest": "sha256:5277f2a503ebd17ba9d9b86cc9bac86265504adeb449c0647616ddaacd3cbc41" 2025-12-04T12:53:25.2625572Z }, 2025-12-04T12:53:25.2625654Z { 2025-12-04T12:53:25.2625782Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2625938Z "size": 107, 2025-12-04T12:53:25.2626117Z "digest": "sha256:3198a9717aace920fd5de085319adf75091af05fc4318ce4b16a8a5b0e8d449e" 2025-12-04T12:53:25.2626292Z }, 2025-12-04T12:53:25.2626374Z { 2025-12-04T12:53:25.2626502Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2626657Z "size": 166, 2025-12-04T12:53:25.2626814Z "digest": "sha256:99a4918e5808277879449e97ccd7190db6b9aa2d742b57a3b831ce0198522bdd" 2025-12-04T12:53:25.2627024Z }, 2025-12-04T12:53:25.2627112Z { 2025-12-04T12:53:25.2627242Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2627402Z "size": 3526081, 2025-12-04T12:53:25.2627569Z "digest": "sha256:15bb11dfc6acc3537d527d6771c8e711e5605e99f82ec41e805d4600b8a97516" 2025-12-04T12:53:25.2627747Z }, 2025-12-04T12:53:25.2627830Z { 2025-12-04T12:53:25.2627958Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2628116Z "size": 107, 2025-12-04T12:53:25.2628278Z "digest": "sha256:bd87c8766e90e33db17514558ac591cc3f4149afd7abeaef4dd5770bbfa14210" 2025-12-04T12:53:25.2628457Z }, 2025-12-04T12:53:25.2628540Z { 2025-12-04T12:53:25.2628669Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2628825Z "size": 829, 2025-12-04T12:53:25.2628983Z "digest": "sha256:1969e15d0c13874ea5883ed829235a19ef6dc21c8aa6172032b78a8ffa6ff262" 2025-12-04T12:53:25.2629164Z }, 2025-12-04T12:53:25.2629245Z { 2025-12-04T12:53:25.2629370Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2629528Z "size": 26973054, 2025-12-04T12:53:25.2629690Z "digest": "sha256:24a03847d382b73c11969f8f73916a6bedf5ccea12f6f4290b3880f29ceda32a" 2025-12-04T12:53:25.2629863Z }, 2025-12-04T12:53:25.2629945Z { 2025-12-04T12:53:25.2630072Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2630270Z "size": 104, 2025-12-04T12:53:25.2630431Z "digest": "sha256:816e2e34e01839a35d624dbf4bd9ac9bea4c975104af47a0e6b6b6dee6c6f98d" 2025-12-04T12:53:25.2630609Z }, 2025-12-04T12:53:25.2630691Z { 2025-12-04T12:53:25.2630820Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2630976Z 
"size": 424, 2025-12-04T12:53:25.2631137Z "digest": "sha256:b168858b85373f8ddca549d79267a06de4fa945d04bf791c55c9ddc93957fa3c" 2025-12-04T12:53:25.2631313Z }, 2025-12-04T12:53:25.2631395Z { 2025-12-04T12:53:25.2631575Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2631734Z "size": 19309386, 2025-12-04T12:53:25.2631902Z "digest": "sha256:6b8d5ff02e267e38322afbb8a58ed63ce9d75b10e9e73255e6affcbc6b6539bf" 2025-12-04T12:53:25.2632085Z }, 2025-12-04T12:53:25.2632168Z { 2025-12-04T12:53:25.2632297Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2632455Z "size": 826, 2025-12-04T12:53:25.2632615Z "digest": "sha256:4e3b10a5dd6aed29f238d604925e2a4f873141c1087c8dd4fdde5c61e7560893" 2025-12-04T12:53:25.2632793Z }, 2025-12-04T12:53:25.2632876Z { 2025-12-04T12:53:25.2633005Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2633163Z "size": 724, 2025-12-04T12:53:25.2633319Z "digest": "sha256:9681563a88ff9e62494a2740e537440d3df978d466c9478d6a941fae8b57b084" 2025-12-04T12:53:25.2633494Z }, 2025-12-04T12:53:25.2633576Z { 2025-12-04T12:53:25.2633705Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2633872Z "size": 149, 2025-12-04T12:53:25.2634032Z "digest": "sha256:3092fab73b59190b9facfc49bf18f58612172bc2fd68dfa339a1118632616939" 2025-12-04T12:53:25.2634210Z }, 2025-12-04T12:53:25.2634292Z { 2025-12-04T12:53:25.2634421Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2634578Z "size": 136, 2025-12-04T12:53:25.2634742Z "digest": "sha256:20020dd28a15ba092fcbfe906ee39cdddfcc9d0b7eb42fdd6f4c08a984fa9c00" 2025-12-04T12:53:25.2634924Z }, 2025-12-04T12:53:25.2635007Z { 2025-12-04T12:53:25.2635136Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2635292Z "size": 140, 2025-12-04T12:53:25.2635451Z "digest": "sha256:ae5280ce969dcff08c091e9a5f7641f13561b2b0ee44d78b7c3f81d8fe8e6d32" 2025-12-04T12:53:25.2635629Z }, 2025-12-04T12:53:25.2635705Z { 2025-12-04T12:53:25.2635832Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2636052Z "size": 32, 2025-12-04T12:53:25.2636217Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T12:53:25.2636396Z }, 2025-12-04T12:53:25.2636482Z { 2025-12-04T12:53:25.2636610Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2636769Z "size": 222, 2025-12-04T12:53:25.2636931Z "digest": "sha256:fe17d9eb0fd26d3af4c724bf570d833978b131cedb7dc17a800aa388a246b3cd" 2025-12-04T12:53:25.2637112Z }, 2025-12-04T12:53:25.2637200Z { 2025-12-04T12:53:25.2637333Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2637490Z "size": 346, 2025-12-04T12:53:25.2637648Z "digest": "sha256:a51e0dab2d596e6563483f27c12660007160847d177ba4c31812a8f44ada5754" 2025-12-04T12:53:25.2637824Z }, 2025-12-04T12:53:25.2637908Z { 2025-12-04T12:53:25.2638036Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2638193Z "size": 88300, 2025-12-04T12:53:25.2638373Z "digest": "sha256:6eb176cefd72d37ecbcdf074289a8f1de732d8816cc695ece7e4709d098094d6" 2025-12-04T12:53:25.2638553Z }, 2025-12-04T12:53:25.2638636Z { 2025-12-04T12:53:25.2638760Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2638917Z "size": 106, 2025-12-04T12:53:25.2639077Z "digest": 
"sha256:e7b8cf2e8d5a4c56db9726ce62c1176032408b3b1c25a000592361cb4245e2b5" 2025-12-04T12:53:25.2639252Z }, 2025-12-04T12:53:25.2639333Z { 2025-12-04T12:53:25.2639462Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2639617Z "size": 1671, 2025-12-04T12:53:25.2639781Z "digest": "sha256:ef3a5060abce88884bc8bd815aa41c46427f34eeb132fe0ddd85a3f86e6dc83d" 2025-12-04T12:53:25.2639962Z }, 2025-12-04T12:53:25.2640049Z { 2025-12-04T12:53:25.2640225Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2640382Z "size": 724, 2025-12-04T12:53:25.2640576Z "digest": "sha256:9681563a88ff9e62494a2740e537440d3df978d466c9478d6a941fae8b57b084" 2025-12-04T12:53:25.2640755Z }, 2025-12-04T12:53:25.2640838Z { 2025-12-04T12:53:25.2640968Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2641123Z "size": 138, 2025-12-04T12:53:25.2641292Z "digest": "sha256:a6f4ec14b42b8f0a83d20aa6a985ddb6a1bf64e0ed3d44afd3484b87d4ed5ad3" 2025-12-04T12:53:25.2641471Z }, 2025-12-04T12:53:25.2641554Z { 2025-12-04T12:53:25.2641681Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2641839Z "size": 119, 2025-12-04T12:53:25.2641995Z "digest": "sha256:7e5a0c956cfbd6f8074fbfd3b1d416e6635d632835ec00c8dd4c015a21da19b4" 2025-12-04T12:53:25.2642171Z }, 2025-12-04T12:53:25.2642251Z { 2025-12-04T12:53:25.2642379Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2642536Z "size": 6238423049, 2025-12-04T12:53:25.2642704Z "digest": "sha256:b4f78730cfe76ce091b78b2e2e3d52be03f1097b3e4c3de5bd79f8d13a853132" 2025-12-04T12:53:25.2642887Z }, 2025-12-04T12:53:25.2642969Z { 2025-12-04T12:53:25.2643100Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2643254Z "size": 174, 2025-12-04T12:53:25.2643407Z "digest": "sha256:081028f24389b112683689fd362e8c0d6f358082710e72feab91cea6383feb4d" 2025-12-04T12:53:25.2643579Z }, 2025-12-04T12:53:25.2643661Z { 2025-12-04T12:53:25.2643789Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2643946Z "size": 1896, 2025-12-04T12:53:25.2644112Z "digest": "sha256:a534dcf4b9a9e5fabed742c8a8fc43c9cfe7346ea88ab3c177c3b14fd3afe00a" 2025-12-04T12:53:25.2644293Z }, 2025-12-04T12:53:25.2644375Z { 2025-12-04T12:53:25.2644502Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2644659Z "size": 197577597, 2025-12-04T12:53:25.2644826Z "digest": "sha256:2e77500302cc13224427e1d74e471bd79d5109ba6a5099a83df1d10b786f71ba" 2025-12-04T12:53:25.2645040Z }, 2025-12-04T12:53:25.2645123Z { 2025-12-04T12:53:25.2645251Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2645409Z "size": 304, 2025-12-04T12:53:25.2645574Z "digest": "sha256:bc08246bb4ba18c3ec5bc69e16b6b4e929c5bd0f3fae10eeb0b1a622a63d6fa2" 2025-12-04T12:53:25.2645756Z }, 2025-12-04T12:53:25.2645838Z { 2025-12-04T12:53:25.2645972Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2646129Z "size": 32, 2025-12-04T12:53:25.2646292Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T12:53:25.2646470Z }, 2025-12-04T12:53:25.2646554Z { 2025-12-04T12:53:25.2646682Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2646839Z "size": 106, 2025-12-04T12:53:25.2647001Z "digest": "sha256:ff0c473ca120ebdcaa2ba10b3274e82032edd5196019e76d4e7584553704ae81" 
2025-12-04T12:53:25.2647179Z }, 2025-12-04T12:53:25.2647266Z { 2025-12-04T12:53:25.2647400Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T12:53:25.2647558Z "size": 54145662, 2025-12-04T12:53:25.2647727Z "digest": "sha256:6bbc14b250efb3cdaad12c91573c6bb9129ad3e3432f0ed1a7eaebc9958d162f" 2025-12-04T12:53:25.2647910Z } 2025-12-04T12:53:25.2647992Z ] 2025-12-04T12:53:25.2648075Z } 2025-12-04T12:53:25.2648166Z + exit 0 2025-12-04T12:53:25.2665831Z ##[group]Run set -eux 2025-12-04T12:53:25.2665955Z set -eux 2025-12-04T12:53:25.2666117Z # It's ok if this steps fails, it would then be an anonymous user like what we used to have 2025-12-04T12:53:25.2666535Z aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token | jq --raw-output '.SecretString' | jq -r .docker_hub_readonly_token | docker login --username pytorchbot --password-stdin || true 2025-12-04T12:53:25.2671843Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:25.2671998Z env: 2025-12-04T12:53:25.2672106Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:25.2672301Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:25.2672477Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:25.2672645Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:25.2673155Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:25.2673650Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:53:25.2673771Z AWS_REGION: us-east-1 2025-12-04T12:53:25.2673981Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:53:25.2674148Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:53:25.2676395Z AWS_SESSION_TOKEN: *** 2025-12-04T12:53:25.2676508Z ##[endgroup] 2025-12-04T12:53:25.2697445Z + aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token 2025-12-04T12:53:25.2697774Z /home/runner/_work/_temp/fe8aaec9-a1dc-4af5-b14d-56605f6d1201.sh: line 3: aws: command not found 2025-12-04T12:53:25.2697988Z + jq --raw-output .SecretString 2025-12-04T12:53:25.2698816Z + jq -r .docker_hub_readonly_token 2025-12-04T12:53:25.2698991Z + docker login --username pytorchbot --password-stdin 2025-12-04T12:53:25.2788283Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T12:53:25.2794961Z + true 2025-12-04T12:53:25.2866211Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2025-12-04T12:53:25.2866409Z with: 2025-12-04T12:53:25.2866690Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:53:25.2867023Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:25.2867182Z env: 2025-12-04T12:53:25.2867279Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:25.2867579Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:25.2867762Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:25.2867937Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:25.2868471Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon 
--group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:25.2868980Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:53:25.2869107Z AWS_REGION: us-east-1 2025-12-04T12:53:25.2869323Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:53:25.2869485Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:53:25.2871757Z AWS_SESSION_TOKEN: *** 2025-12-04T12:53:25.2871866Z ##[endgroup] 2025-12-04T12:53:25.2878665Z ##[group]Run set -x 2025-12-04T12:53:25.2878796Z set -x 2025-12-04T12:53:25.2878903Z set +e 2025-12-04T12:53:25.2878998Z  2025-12-04T12:53:25.2879096Z login() { 2025-12-04T12:53:25.2879285Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-12-04T12:53:25.2879489Z } 2025-12-04T12:53:25.2879583Z  2025-12-04T12:53:25.2879678Z retry () { 2025-12-04T12:53:25.2879798Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-12-04T12:53:25.2879933Z } 2025-12-04T12:53:25.2880031Z  2025-12-04T12:53:25.2880139Z retry login "${DOCKER_REGISTRY}" 2025-12-04T12:53:25.2880310Z  2025-12-04T12:53:25.2880496Z IMAGE_SIZE=$(docker manifest inspect "${DOCKER_IMAGE}" | jq '[.layers[].size, .config.size] | add / 1024 / 1024') 2025-12-04T12:53:25.2880744Z echo "Compressed size of image in MB: ${IMAGE_SIZE}" 2025-12-04T12:53:25.2880885Z  2025-12-04T12:53:25.2880970Z set -e 2025-12-04T12:53:25.2881107Z # ignore output since only exit code is used for conditional 2025-12-04T12:53:25.2881293Z # only pull docker image if it's not available locally 2025-12-04T12:53:25.2881497Z if ! docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2025-12-04T12:53:25.2881683Z  retry docker pull "${DOCKER_IMAGE}" 2025-12-04T12:53:25.2881802Z fi 2025-12-04T12:53:25.2886453Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:25.2886598Z env: 2025-12-04T12:53:25.2886688Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:25.2886821Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:25.2886995Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:25.2887158Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:25.2887669Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:25.2888163Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:53:25.2888283Z AWS_REGION: us-east-1 2025-12-04T12:53:25.2888420Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:53:25.2888574Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:53:25.2890789Z AWS_SESSION_TOKEN: *** 2025-12-04T12:53:25.2891151Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:53:25.2891465Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:25.2891614Z ##[endgroup] 2025-12-04T12:53:25.2908291Z + set +e 2025-12-04T12:53:25.2908444Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:25.2908611Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:25.2911188Z + aws ecr get-login-password --region us-east-1 2025-12-04T12:53:25.2911428Z /home/runner/_work/_temp/4df4eda5-05ac-4976-bd97-ebc36ca5d71c.sh: line 5: aws: command not found 2025-12-04T12:53:25.2912402Z + docker login -u AWS --password-stdin 
308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:25.2996844Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T12:53:25.3007863Z + sleep 1 2025-12-04T12:53:26.3017680Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:26.3022131Z + aws ecr get-login-password --region us-east-1 2025-12-04T12:53:26.3022810Z /home/runner/_work/_temp/4df4eda5-05ac-4976-bd97-ebc36ca5d71c.sh: line 5: aws: command not found 2025-12-04T12:53:26.3023446Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:26.3097144Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T12:53:26.3111641Z + sleep 2 2025-12-04T12:53:28.3125114Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:28.3128765Z + aws ecr get-login-password --region us-east-1 2025-12-04T12:53:28.3129450Z /home/runner/_work/_temp/4df4eda5-05ac-4976-bd97-ebc36ca5d71c.sh: line 5: aws: command not found 2025-12-04T12:53:28.3130295Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T12:53:28.3223653Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T12:53:28.3240749Z ++ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:53:28.3241805Z ++ jq '[.layers[].size, .config.size] | add / 1024 / 1024' 2025-12-04T12:53:29.6479676Z + IMAGE_SIZE=18171.470620155334 2025-12-04T12:53:29.6480309Z + echo 'Compressed size of image in MB: 18171.470620155334' 2025-12-04T12:53:29.6480738Z + set -e 2025-12-04T12:53:29.6481551Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:53:29.6482477Z Compressed size of image in MB: 18171.470620155334 2025-12-04T12:53:29.6664152Z Prepare all required actions 2025-12-04T12:53:29.6680814Z ##[group]Run ./.github/actions/get-workflow-job-id 2025-12-04T12:53:29.6680962Z with: 2025-12-04T12:53:29.6681285Z github-token: *** 2025-12-04T12:53:29.6681394Z env: 2025-12-04T12:53:29.6681497Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:29.6681648Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:29.6681830Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:29.6682004Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:29.6682518Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:29.6683035Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:53:29.6683187Z AWS_REGION: us-east-1 2025-12-04T12:53:29.6683803Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:53:29.6684270Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:53:29.6688406Z AWS_SESSION_TOKEN: *** 2025-12-04T12:53:29.6688613Z ##[endgroup] 2025-12-04T12:53:29.6699542Z ##[group]Run set -eux 2025-12-04T12:53:29.6699722Z set -eux 2025-12-04T12:53:29.6699995Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-12-04T12:53:29.6705349Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:53:29.6705534Z env: 2025-12-04T12:53:29.6705651Z GIT_DEFAULT_BRANCH: main 
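The compressed-size figure printed above (about 18.2 GB for this ROCm image) is computed entirely from the registry manifest: docker manifest inspect lists each layer's compressed size plus the config blob, and the jq expression sums them and converts bytes to MB. The same calculation in isolation, using this job's image:

    IMG=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a
    # Sum all layer sizes plus the config blob size, in MB.
    docker manifest inspect "$IMG" \
      | jq '[.layers[].size, .config.size] | add / 1024 / 1024'   # prints 18171.470620155334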
2025-12-04T12:53:29.6705820Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:29.6706043Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:29.6706246Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:29.6706995Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:29.6707597Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:53:29.6707746Z AWS_REGION: us-east-1 2025-12-04T12:53:29.6707909Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:53:29.6708099Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:53:29.6710777Z AWS_SESSION_TOKEN: *** 2025-12-04T12:53:29.6710982Z GITHUB_TOKEN: *** 2025-12-04T12:53:29.6711106Z ##[endgroup] 2025-12-04T12:53:29.6733140Z + python3 .github/scripts/get_workflow_job_id.py 19922849170 linux.rocm.gpu.gfx942.4.b-bphpw-runner-rpncb 2025-12-04T12:53:30.8947630Z Setting output job-id=57116213181 2025-12-04T12:53:30.8948386Z Setting output job-name=linux-jammy-rocm-py3.10 / test (distributed, 3, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T12:53:30.9059556Z Prepare all required actions 2025-12-04T12:53:30.9059802Z Getting action download info 2025-12-04T12:53:31.1050308Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2025-12-04T12:53:31.9574975Z Download action repository 'actions/download-artifact@v4' (SHA:d3f86a106a0bac45b974a628896c90dbdf5c8093) 2025-12-04T12:53:32.7775470Z ##[group]Run ./.github/actions/download-build-artifacts 2025-12-04T12:53:32.7775629Z with: 2025-12-04T12:53:32.7775736Z name: linux-jammy-rocm-py3.10 2025-12-04T12:53:32.7775859Z s3-bucket: gha-artifacts 2025-12-04T12:53:32.7775966Z env: 2025-12-04T12:53:32.7776058Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:32.7776195Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:32.7776371Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:32.7776536Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:53:32.7777073Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:32.7777578Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:53:32.7777696Z AWS_REGION: us-east-1 2025-12-04T12:53:32.7777848Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:53:32.7778001Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:53:32.7783643Z AWS_SESSION_TOKEN: *** 2025-12-04T12:53:32.7783757Z ##[endgroup] 2025-12-04T12:53:32.7812456Z ##[group]Run seemethere/download-artifact-s3@v4 2025-12-04T12:53:32.7812590Z with: 2025-12-04T12:53:32.7812691Z name: linux-jammy-rocm-py3.10 2025-12-04T12:53:32.7812813Z s3-bucket: gha-artifacts 2025-12-04T12:53:32.7812923Z region: us-east-1 2025-12-04T12:53:32.7813016Z env: 2025-12-04T12:53:32.7813108Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:53:32.7813248Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:53:32.7813423Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:53:32.7813587Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 
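get_workflow_job_id.py resolves which job of the run this runner is executing (job-id 57116213181 here) so that test results and artifacts can be tagged with it; conceptually it lists the run's jobs and matches on runner name. A hedged sketch of that lookup against GitHub's list-jobs REST endpoint; the real script in .github/scripts/ also handles retries and pagination:

    # Assumes GITHUB_TOKEN, GITHUB_RUN_ID and RUNNER_NAME are set, as in this job.
    curl -s -H "Authorization: Bearer ${GITHUB_TOKEN}" \
      "https://api.github.com/repos/pytorch/pytorch/actions/runs/${GITHUB_RUN_ID}/jobs?per_page=100" \
      | jq -r --arg rn "$RUNNER_NAME" '.jobs[] | select(.runner_name == $rn) | .id'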
2025-12-04T12:53:32.7814093Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:53:32.7814588Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:53:32.7835289Z AWS_REGION: us-east-1 2025-12-04T12:53:32.7835439Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:53:32.7835590Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:53:32.7837753Z AWS_SESSION_TOKEN: *** 2025-12-04T12:53:32.7837854Z ##[endgroup] 2025-12-04T12:53:33.0072581Z (node:17078) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-12-04T12:53:33.0073290Z 2025-12-04T12:53:33.0073462Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-12-04T12:53:33.0073890Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-12-04T12:53:33.0074392Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-12-04T12:53:33.2729102Z Found 1 objects with prefix pytorch/pytorch/19922849170/linux-jammy-rocm-py3.10/ 2025-12-04T12:53:33.2729793Z Starting download (1/1): /home/runner/_work/pytorch/pytorch/artifacts.zip 2025-12-04T12:54:05.8326101Z Finished download (1/1): /home/runner/_work/pytorch/pytorch/artifacts.zip 2025-12-04T12:54:05.8330743Z Artifact download has finished successfully 2025-12-04T12:54:05.8598697Z ##[group]Run unzip -o artifacts.zip 2025-12-04T12:54:05.8598916Z unzip -o artifacts.zip 2025-12-04T12:54:05.8603669Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:54:05.8603837Z env: 2025-12-04T12:54:05.8604131Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:05.8604284Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:05.8604481Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:05.8604670Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:05.8605241Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:05.8605801Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:05.8605934Z AWS_REGION: us-east-1 2025-12-04T12:54:05.8606113Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:05.8606285Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:05.8608750Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:05.8608871Z ##[endgroup] 2025-12-04T12:54:05.8643066Z Archive: artifacts.zip 2025-12-04T12:54:05.8644374Z creating: dist/ 2025-12-04T12:54:05.8728325Z inflating: dist/.ninja_log 2025-12-04T12:54:08.8027632Z inflating: dist/torch-2.10.0a0+gitffd9b0f-cp310-cp310-linux_x86_64.whl 2025-12-04T12:54:08.8028113Z creating: build/ 2025-12-04T12:54:08.8030837Z creating: build/custom_test_artifacts/ 2025-12-04T12:54:08.8031252Z creating: build/custom_test_artifacts/custom-op-build/ 2025-12-04T12:54:08.8031729Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2025-12-04T12:54:08.8032265Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2025-12-04T12:54:08.8032865Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T12:54:08.8033441Z creating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/ 2025-12-04T12:54:08.8034010Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T12:54:08.8034656Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T12:54:08.8035258Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T12:54:08.8035933Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T12:54:08.8036631Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T12:54:08.8037274Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T12:54:08.8037896Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T12:54:08.8038500Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T12:54:08.8039208Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T12:54:08.8039875Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T12:54:08.8041010Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T12:54:08.8041537Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T12:54:08.8042092Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T12:54:08.8042569Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2025-12-04T12:54:08.8042953Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2025-12-04T12:54:08.8043356Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2025-12-04T12:54:08.8043776Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2025-12-04T12:54:08.8044425Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2025-12-04T12:54:08.8044947Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2025-12-04T12:54:08.8045442Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2025-12-04T12:54:08.8045906Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2025-12-04T12:54:08.8046384Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2025-12-04T12:54:08.8046866Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2025-12-04T12:54:08.8047350Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2025-12-04T12:54:08.8047830Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2025-12-04T12:54:08.8048311Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2025-12-04T12:54:08.8051845Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2025-12-04T12:54:08.8158962Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2025-12-04T12:54:08.8159313Z creating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2025-12-04T12:54:08.8159676Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2025-12-04T12:54:08.8160072Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2025-12-04T12:54:08.8160504Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2025-12-04T12:54:08.8160853Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2025-12-04T12:54:08.8161221Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2025-12-04T12:54:08.8161607Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2025-12-04T12:54:08.8161971Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2025-12-04T12:54:08.8162330Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2025-12-04T12:54:08.8162685Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2025-12-04T12:54:08.8173178Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2025-12-04T12:54:08.8216639Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2025-12-04T12:54:08.8217012Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T12:54:08.8217412Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2025-12-04T12:54:08.8217708Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2025-12-04T12:54:08.8217980Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2025-12-04T12:54:08.8218303Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2025-12-04T12:54:08.8218577Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_outer_vec.cc 2025-12-04T12:54:08.8218850Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_vec_ext.cc 2025-12-04T12:54:08.8219680Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2025-12-04T12:54:08.8220690Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2025-12-04T12:54:08.8221127Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2025-12-04T12:54:08.8312443Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2025-12-04T12:54:08.8341589Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2025-12-04T12:54:08.8341904Z creating: build/custom_test_artifacts/jit-hook-build/ 2025-12-04T12:54:08.8342179Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2025-12-04T12:54:08.8342489Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2025-12-04T12:54:08.8344250Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T12:54:08.8344630Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/ 2025-12-04T12:54:08.8344970Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T12:54:08.8345339Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/ 
2025-12-04T12:54:08.8345694Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T12:54:08.8346118Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T12:54:08.8347197Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T12:54:08.8347805Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T12:54:08.8348320Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T12:54:08.8348808Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T12:54:08.8349352Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T12:54:08.8349904Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T12:54:08.8350450Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T12:54:08.8351041Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T12:54:08.8351625Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T12:54:08.8352129Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2025-12-04T12:54:08.8352539Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2025-12-04T12:54:08.8352966Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2025-12-04T12:54:08.8353414Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2025-12-04T12:54:08.8353915Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2025-12-04T12:54:08.8354480Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2025-12-04T12:54:08.8355368Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2025-12-04T12:54:08.8355873Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2025-12-04T12:54:08.8356398Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2025-12-04T12:54:08.8356920Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2025-12-04T12:54:08.8357439Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2025-12-04T12:54:08.8357958Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2025-12-04T12:54:08.8358474Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2025-12-04T12:54:08.8365863Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2025-12-04T12:54:08.8399750Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2025-12-04T12:54:08.8400143Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T12:54:08.8400521Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2025-12-04T12:54:08.8400850Z extracting: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2025-12-04T12:54:08.8401141Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2025-12-04T12:54:08.8401500Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2025-12-04T12:54:08.8401797Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_outer_vec.cc 2025-12-04T12:54:08.8402098Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_vec_ext.cc 2025-12-04T12:54:08.8402794Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2025-12-04T12:54:08.8403103Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2025-12-04T12:54:08.8403356Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2025-12-04T12:54:08.8423957Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2025-12-04T12:54:08.8424183Z creating: build/custom_test_artifacts/custom-backend-build/ 2025-12-04T12:54:08.8424412Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2025-12-04T12:54:08.8424673Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2025-12-04T12:54:08.8426714Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T12:54:08.8427013Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/ 2025-12-04T12:54:08.8427295Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T12:54:08.8427611Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T12:54:08.8427914Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T12:54:08.8428598Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T12:54:08.8429374Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T12:54:08.8429691Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T12:54:08.8429999Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T12:54:08.8430356Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T12:54:08.8431278Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T12:54:08.8432044Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T12:54:08.8432370Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T12:54:08.8433345Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T12:54:08.8434132Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T12:54:08.8434454Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2025-12-04T12:54:08.8434715Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2025-12-04T12:54:08.8434987Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2025-12-04T12:54:08.8435281Z creating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2025-12-04T12:54:08.8435652Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2025-12-04T12:54:08.8436022Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2025-12-04T12:54:08.8436367Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2025-12-04T12:54:08.8436695Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2025-12-04T12:54:08.8437033Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2025-12-04T12:54:08.8437374Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2025-12-04T12:54:08.8437718Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2025-12-04T12:54:08.8438063Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2025-12-04T12:54:08.8438398Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2025-12-04T12:54:08.8438830Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2025-12-04T12:54:08.8503233Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2025-12-04T12:54:08.8503543Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2025-12-04T12:54:08.8503857Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2025-12-04T12:54:08.8504199Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2025-12-04T12:54:08.8504533Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2025-12-04T12:54:08.8504846Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2025-12-04T12:54:08.8505165Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2025-12-04T12:54:08.8505490Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2025-12-04T12:54:08.8505813Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2025-12-04T12:54:08.8506136Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2025-12-04T12:54:08.8506452Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2025-12-04T12:54:08.8517007Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2025-12-04T12:54:08.8546565Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2025-12-04T12:54:08.8546906Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T12:54:08.8547204Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2025-12-04T12:54:08.8547483Z extracting: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2025-12-04T12:54:08.8547737Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2025-12-04T12:54:08.8548099Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2025-12-04T12:54:08.8548363Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_outer_vec.cc 2025-12-04T12:54:08.8548621Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_vec_ext.cc 2025-12-04T12:54:08.8549447Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2025-12-04T12:54:08.8549811Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2025-12-04T12:54:08.8550041Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2025-12-04T12:54:08.8604235Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2025-12-04T12:54:08.8625251Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2025-12-04T12:54:08.8625433Z creating: build/lib/ 2025-12-04T12:54:08.8670652Z inflating: build/lib/libprotobuf-lite.a 2025-12-04T12:54:08.8914885Z inflating: build/lib/libprotobuf.a 2025-12-04T12:54:08.9188633Z inflating: build/lib/libprotoc.a 2025-12-04T12:54:08.9194086Z inflating: build/lib/libpthreadpool.a 2025-12-04T12:54:08.9197991Z inflating: build/lib/libcpuinfo.a 2025-12-04T12:54:08.9202119Z inflating: build/lib/libcpuinfo_internals.a 2025-12-04T12:54:08.9202672Z inflating: build/lib/libclog.a 2025-12-04T12:54:08.9213069Z inflating: build/lib/libpytorch_qnnpack.a 2025-12-04T12:54:08.9213957Z inflating: build/lib/libnnpack_reference_layers.a 2025-12-04T12:54:08.9223765Z inflating: build/lib/libnnpack.a 2025-12-04T12:54:08.9325401Z inflating: build/lib/libmicrokernels-prod.a 2025-12-04T12:54:08.9794821Z inflating: build/lib/libmicrokernels-all.a 2025-12-04T12:54:08.9832863Z inflating: build/lib/libgtest.a 2025-12-04T12:54:08.9841983Z inflating: build/lib/libgmock.a 2025-12-04T12:54:08.9842188Z inflating: build/lib/libgtest_main.a 2025-12-04T12:54:08.9842510Z inflating: build/lib/libgmock_main.a 2025-12-04T12:54:08.9892098Z inflating: build/lib/libXNNPACK.a 2025-12-04T12:54:08.9933693Z inflating: build/lib/libbenchmark.a 2025-12-04T12:54:08.9933907Z inflating: build/lib/libbenchmark_main.a 2025-12-04T12:54:08.9934091Z inflating: build/lib/libjitprofiling.a 2025-12-04T12:54:08.9938241Z inflating: build/lib/libittnotify.a 2025-12-04T12:54:08.9974780Z inflating: build/lib/libasmjit.a 2025-12-04T12:54:09.0598854Z inflating: build/lib/libfbgemm.a 2025-12-04T12:54:09.0615547Z inflating: build/lib/libtensorpipe_uv.a 2025-12-04T12:54:09.0912093Z inflating: build/lib/libtensorpipe.a 2025-12-04T12:54:09.0978145Z inflating: build/lib/libgloo.a 2025-12-04T12:54:09.1003798Z inflating: build/lib/libonnx_proto.a 2025-12-04T12:54:09.1225912Z inflating: build/lib/libgloo_hip.a 2025-12-04T12:54:09.1618837Z inflating: build/lib/libonnx.a 2025-12-04T12:54:09.7162938Z inflating: build/lib/libdnnl.a 2025-12-04T12:54:09.7173833Z inflating: build/lib/libfmt.a 2025-12-04T12:54:09.7345507Z inflating: build/lib/libkineto.a 2025-12-04T12:54:09.7410571Z inflating: build/lib/libc10.so 2025-12-04T12:54:09.7412351Z inflating: build/lib/libtorch_global_deps.so 2025-12-04T12:54:09.7412698Z inflating: build/lib/libcaffe2_nvrtc.so 2025-12-04T12:54:09.7438168Z inflating: build/lib/libc10_hip.so 2025-12-04T12:54:09.7712737Z inflating: build/lib/libfbgemm_genai.a 
2025-12-04T12:54:11.4716383Z inflating: build/lib/libtorch_cpu.so 2025-12-04T12:54:11.4718939Z inflating: build/lib/libshm.so 2025-12-04T12:54:12.3001415Z inflating: build/lib/libtorch_hip.so 2025-12-04T12:54:12.3001757Z inflating: build/lib/libtorch.so 2025-12-04T12:54:12.3012247Z inflating: build/lib/libjitbackend_test.so 2025-12-04T12:54:12.3025750Z inflating: build/lib/libbackend_with_compiler.so 2025-12-04T12:54:12.3065000Z inflating: build/lib/libtorchbind_test.so 2025-12-04T12:54:12.3079534Z inflating: build/lib/libaoti_custom_ops.so 2025-12-04T12:54:12.4371273Z inflating: build/lib/libtorch_python.so 2025-12-04T12:54:12.4391150Z inflating: build/lib/libnnapi_backend.so 2025-12-04T12:54:12.4391408Z creating: build/bin/ 2025-12-04T12:54:12.4391609Z creating: build/bin/CMakeFiles/ 2025-12-04T12:54:12.4392152Z inflating: build/bin/cmake_install.cmake 2025-12-04T12:54:12.4392393Z inflating: build/bin/CTestTestfile.cmake 2025-12-04T12:54:12.4644127Z inflating: build/bin/protoc-3.13.0.0 2025-12-04T12:54:12.4896529Z inflating: build/bin/protoc 2025-12-04T12:54:12.4929273Z inflating: build/bin/c10_AllocatorConfig_test 2025-12-04T12:54:12.4960086Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2025-12-04T12:54:12.4991617Z inflating: build/bin/c10_DeviceGuard_test 2025-12-04T12:54:12.5023313Z inflating: build/bin/c10_Device_test 2025-12-04T12:54:12.5059382Z inflating: build/bin/c10_DispatchKeySet_test 2025-12-04T12:54:12.5092522Z inflating: build/bin/c10_Scalar_test 2025-12-04T12:54:12.5122666Z inflating: build/bin/c10_StreamGuard_test 2025-12-04T12:54:12.5157036Z inflating: build/bin/c10_SymInt_test 2025-12-04T12:54:12.5191098Z inflating: build/bin/c10_SizesAndStrides_test 2025-12-04T12:54:12.5223409Z inflating: build/bin/c10_Bitset_test 2025-12-04T12:54:12.5265306Z inflating: build/bin/c10_cow_test 2025-12-04T12:54:12.5298421Z inflating: build/bin/c10_InlineDeviceGuard_test 2025-12-04T12:54:12.5332586Z inflating: build/bin/c10_InlineStreamGuard_test 2025-12-04T12:54:12.5362886Z inflating: build/bin/c10_ArrayRef_test 2025-12-04T12:54:12.5392891Z inflating: build/bin/c10_ConstexprCrc_test 2025-12-04T12:54:12.5423473Z inflating: build/bin/c10_DeadlockDetection_test 2025-12-04T12:54:12.5455728Z inflating: build/bin/c10_IntrusiveList_test 2025-12-04T12:54:12.5486886Z inflating: build/bin/c10_Half_test 2025-12-04T12:54:12.5522012Z inflating: build/bin/c10_Enumerate_test 2025-12-04T12:54:12.5555890Z inflating: build/bin/c10_LeftRight_test 2025-12-04T12:54:12.5588312Z inflating: build/bin/c10_NetworkFlow_test 2025-12-04T12:54:12.5618798Z inflating: build/bin/c10_Semaphore_test 2025-12-04T12:54:12.5649478Z inflating: build/bin/c10_Synchronized_test 2025-12-04T12:54:12.5681383Z inflating: build/bin/c10_TypeIndex_test 2025-12-04T12:54:12.5715539Z inflating: build/bin/c10_ThreadLocal_test 2025-12-04T12:54:12.5747052Z inflating: build/bin/c10_accumulate_test 2025-12-04T12:54:12.5781070Z inflating: build/bin/c10_bfloat16_test 2025-12-04T12:54:12.5811457Z inflating: build/bin/c10_error_test 2025-12-04T12:54:12.5842480Z inflating: build/bin/c10_bit_cast_test 2025-12-04T12:54:12.5876070Z inflating: build/bin/c10_complex_test 2025-12-04T12:54:12.5908128Z inflating: build/bin/c10_exception_test 2025-12-04T12:54:12.5942696Z inflating: build/bin/c10_complex_math_test 2025-12-04T12:54:12.5973611Z inflating: build/bin/c10_flags_test 2025-12-04T12:54:12.6004852Z inflating: build/bin/c10_irange_test 2025-12-04T12:54:12.6035924Z inflating: build/bin/c10_generic_math_test 2025-12-04T12:54:12.6125270Z 
inflating: build/bin/c10_intrusive_ptr_test 2025-12-04T12:54:12.6159818Z inflating: build/bin/c10_logging_test 2025-12-04T12:54:12.6190537Z inflating: build/bin/c10_nofatal_test 2025-12-04T12:54:12.6223102Z inflating: build/bin/c10_lazy_test 2025-12-04T12:54:12.6260628Z inflating: build/bin/c10_ordered_preserving_dict_test 2025-12-04T12:54:12.6292911Z inflating: build/bin/c10_registry_test 2025-12-04T12:54:12.6324452Z inflating: build/bin/c10_ssize_test 2025-12-04T12:54:12.6369176Z inflating: build/bin/c10_optional_test 2025-12-04T12:54:12.6456908Z inflating: build/bin/c10_small_vector_test 2025-12-04T12:54:12.6491354Z inflating: build/bin/c10_string_util_test 2025-12-04T12:54:12.6522257Z inflating: build/bin/c10_tempfile_test 2025-12-04T12:54:12.6552316Z inflating: build/bin/c10_string_view_test 2025-12-04T12:54:12.6579174Z inflating: build/bin/c10_intrusive_ptr_benchmark 2025-12-04T12:54:12.6613265Z inflating: build/bin/c10_typeid_test 2025-12-04T12:54:12.6643234Z inflating: build/bin/c10_hip_HIPAssertionsTest_1_var_test 2025-12-04T12:54:12.6673364Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_stream 2025-12-04T12:54:12.6703680Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_thread_and_block_and_device 2025-12-04T12:54:12.6733609Z inflating: build/bin/c10_hip_HIPAssertionsTest_from_2_processes 2025-12-04T12:54:12.6763740Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_blocks_and_threads 2025-12-04T12:54:12.6793665Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_multiple_blocks 2025-12-04T12:54:12.6823679Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_same_block 2025-12-04T12:54:12.6853851Z inflating: build/bin/c10_hip_HIPTest 2025-12-04T12:54:12.7180756Z inflating: build/bin/vec_test_all_types_DEFAULT 2025-12-04T12:54:12.7515503Z inflating: build/bin/vec_test_all_types_AVX512 2025-12-04T12:54:12.7857366Z inflating: build/bin/vec_test_all_types_AVX2 2025-12-04T12:54:12.7915067Z inflating: build/bin/test_aoti_abi_check 2025-12-04T12:54:12.7945262Z inflating: build/bin/test_vec_half_DEFAULT 2025-12-04T12:54:12.7975890Z inflating: build/bin/test_vec_half_AVX2 2025-12-04T12:54:12.8006602Z inflating: build/bin/test_vec_half_AVX512 2025-12-04T12:54:12.8038786Z inflating: build/bin/BackoffTest 2025-12-04T12:54:12.8071388Z inflating: build/bin/FileStoreTest 2025-12-04T12:54:12.8106196Z inflating: build/bin/TCPStoreTest 2025-12-04T12:54:12.8139004Z inflating: build/bin/HashStoreTest 2025-12-04T12:54:12.8179415Z inflating: build/bin/ProcessGroupGlooTest 2025-12-04T12:54:12.8181163Z inflating: build/bin/example_allreduce 2025-12-04T12:54:12.8183108Z inflating: build/bin/torch_shm_manager 2025-12-04T12:54:12.8216525Z inflating: build/bin/static_runtime_bench 2025-12-04T12:54:12.8360024Z inflating: build/bin/static_runtime_test 2025-12-04T12:54:12.8403636Z inflating: build/bin/Dict_test 2025-12-04T12:54:12.8435642Z inflating: build/bin/Dimname_test 2025-12-04T12:54:12.8474735Z inflating: build/bin/MaybeOwned_test 2025-12-04T12:54:12.8509333Z inflating: build/bin/NamedTensor_test 2025-12-04T12:54:12.8545228Z inflating: build/bin/apply_utils_test 2025-12-04T12:54:12.8581113Z inflating: build/bin/atest 2025-12-04T12:54:12.8619727Z inflating: build/bin/basic 2025-12-04T12:54:12.8652862Z inflating: build/bin/broadcast_test 2025-12-04T12:54:12.8684161Z inflating: build/bin/cpu_allocator_test 2025-12-04T12:54:12.8719455Z inflating: build/bin/cpu_generator_test 2025-12-04T12:54:12.8751519Z inflating: 
build/bin/cpu_profiling_allocator_test 2025-12-04T12:54:12.8806425Z inflating: build/bin/cpu_rng_test 2025-12-04T12:54:12.8838095Z inflating: build/bin/dlconvertor_test 2025-12-04T12:54:12.8873044Z inflating: build/bin/extension_backend_test 2025-12-04T12:54:12.8906835Z inflating: build/bin/half_test 2025-12-04T12:54:12.8964497Z inflating: build/bin/ivalue_test 2025-12-04T12:54:12.8994955Z inflating: build/bin/lazy_tensor_test 2025-12-04T12:54:12.9027176Z inflating: build/bin/math_kernel_test 2025-12-04T12:54:12.9059330Z inflating: build/bin/memory_format_test 2025-12-04T12:54:12.9091939Z inflating: build/bin/memory_overlapping_test 2025-12-04T12:54:12.9124351Z inflating: build/bin/mobile_memory_cleanup 2025-12-04T12:54:12.9158423Z inflating: build/bin/native_test 2025-12-04T12:54:12.9189809Z inflating: build/bin/operator_name_test 2025-12-04T12:54:12.9221012Z inflating: build/bin/operators_test 2025-12-04T12:54:12.9253103Z inflating: build/bin/packedtensoraccessor_test 2025-12-04T12:54:12.9293749Z inflating: build/bin/pow_test 2025-12-04T12:54:12.9328282Z inflating: build/bin/quantized_test 2025-12-04T12:54:12.9358982Z inflating: build/bin/reduce_ops_test 2025-12-04T12:54:12.9390438Z inflating: build/bin/reportMemoryUsage_test 2025-12-04T12:54:12.9424310Z inflating: build/bin/scalar_tensor_test 2025-12-04T12:54:12.9459239Z inflating: build/bin/scalar_test 2025-12-04T12:54:12.9490739Z inflating: build/bin/StorageUtils_test 2025-12-04T12:54:12.9522630Z inflating: build/bin/stride_properties_test 2025-12-04T12:54:12.9569915Z inflating: build/bin/tensor_iterator_test 2025-12-04T12:54:12.9602973Z inflating: build/bin/test_parallel 2025-12-04T12:54:12.9634134Z inflating: build/bin/thread_init_test 2025-12-04T12:54:12.9667650Z inflating: build/bin/type_ptr_test 2025-12-04T12:54:12.9703658Z inflating: build/bin/type_test 2025-12-04T12:54:12.9735696Z inflating: build/bin/undefined_tensor_test 2025-12-04T12:54:12.9766049Z inflating: build/bin/verify_api_visibility 2025-12-04T12:54:12.9808900Z inflating: build/bin/legacy_vmap_test 2025-12-04T12:54:12.9840500Z inflating: build/bin/weakref_test 2025-12-04T12:54:12.9872089Z inflating: build/bin/wrapdim_test 2025-12-04T12:54:12.9933513Z inflating: build/bin/List_test 2025-12-04T12:54:12.9964997Z inflating: build/bin/xla_tensor_test 2025-12-04T12:54:13.0001081Z inflating: build/bin/IListRef_test 2025-12-04T12:54:13.0070849Z inflating: build/bin/kernel_function_legacy_test 2025-12-04T12:54:13.0110701Z inflating: build/bin/KernelFunction_test 2025-12-04T12:54:13.0167221Z inflating: build/bin/kernel_function_test 2025-12-04T12:54:13.0240712Z inflating: build/bin/kernel_lambda_legacy_test 2025-12-04T12:54:13.0300844Z inflating: build/bin/kernel_lambda_test 2025-12-04T12:54:13.0337210Z inflating: build/bin/kernel_stackbased_test 2025-12-04T12:54:13.0393408Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2025-12-04T12:54:13.0424796Z inflating: build/bin/CppSignature_test 2025-12-04T12:54:13.0455047Z inflating: build/bin/op_allowlist_test 2025-12-04T12:54:13.0632210Z inflating: build/bin/op_registration_test 2025-12-04T12:54:13.0662389Z inflating: build/bin/hip_complex_math_test 2025-12-04T12:54:13.0696096Z inflating: build/bin/backend_fallback_test 2025-12-04T12:54:13.0726237Z inflating: build/bin/hip_complex_test 2025-12-04T12:54:13.0767096Z inflating: build/bin/inline_container_test 2025-12-04T12:54:13.0799610Z inflating: build/bin/hip_apply_test 2025-12-04T12:54:13.0829782Z inflating: build/bin/hip_distributions_test 2025-12-04T12:54:13.0859866Z 
inflating: build/bin/hip_generator_test 2025-12-04T12:54:13.0890040Z inflating: build/bin/hip_half_test 2025-12-04T12:54:13.0920217Z inflating: build/bin/hip_integer_divider_test 2025-12-04T12:54:13.0950309Z inflating: build/bin/hip_optional_test 2025-12-04T12:54:13.0980397Z inflating: build/bin/hip_packedtensoraccessor_test 2025-12-04T12:54:13.1010522Z inflating: build/bin/hip_vectorized_test 2025-12-04T12:54:13.1042242Z inflating: build/bin/hip_dlconvertor_test 2025-12-04T12:54:13.1662704Z inflating: build/bin/test_jit 2025-12-04T12:54:13.1861015Z inflating: build/bin/test_lazy 2025-12-04T12:54:13.1894760Z inflating: build/bin/test_dist_autograd 2025-12-04T12:54:13.1935955Z inflating: build/bin/test_cpp_rpc 2025-12-04T12:54:13.1937304Z inflating: build/bin/parallel_benchmark 2025-12-04T12:54:13.2594070Z inflating: build/bin/test_api 2025-12-04T12:54:13.2594475Z creating: .additional_ci_files/ 2025-12-04T12:54:13.2630058Z inflating: .additional_ci_files/test-times.json 2025-12-04T12:54:13.2761808Z inflating: .additional_ci_files/test-class-times.json 2025-12-04T12:54:13.2789670Z ##[group]Run rm artifacts.zip 2025-12-04T12:54:13.2789855Z rm artifacts.zip 2025-12-04T12:54:13.2794952Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:54:13.2795146Z env: 2025-12-04T12:54:13.2795261Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:13.2795439Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:13.2795659Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:13.2795872Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:13.2796662Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:13.2797291Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:13.2797437Z AWS_REGION: us-east-1 2025-12-04T12:54:13.2797634Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:13.2797824Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:13.2800267Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:13.2800377Z ##[endgroup] 2025-12-04T12:54:13.3815882Z ##[group]Run df -H 2025-12-04T12:54:13.3816044Z df -H 2025-12-04T12:54:13.3821295Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:54:13.3821448Z env: 2025-12-04T12:54:13.3821547Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:13.3821690Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:13.3821875Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:13.3822047Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:13.3822580Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:13.3823074Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:13.3823193Z AWS_REGION: us-east-1 2025-12-04T12:54:13.3823391Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:13.3823547Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:13.3825726Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:13.3825840Z ##[endgroup] 2025-12-04T12:54:13.3970274Z Filesystem Size Used Avail Use% Mounted on 2025-12-04T12:54:13.3970512Z overlay 16T 363G 15T 3% 
/ 2025-12-04T12:54:13.3970732Z tmpfs 68M 0 68M 0% /dev 2025-12-04T12:54:13.3970904Z /dev/md0 16T 363G 15T 3% /run 2025-12-04T12:54:13.3971078Z shm 68M 17k 68M 1% /dev/shm 2025-12-04T12:54:13.3971312Z amdprj2-k8s_2 5.5T 120G 5.4T 3% /home/runner/pytorch-data 2025-12-04T12:54:13.3971561Z tmpfs 3.3T 13k 3.3T 1% /run/secrets/kubernetes.io/serviceaccount 2025-12-04T12:54:13.3971920Z tmpfs 1.7T 0 1.7T 0% /proc/acpi 2025-12-04T12:54:13.3972229Z tmpfs 1.7T 0 1.7T 0% /proc/scsi 2025-12-04T12:54:13.3972408Z tmpfs 1.7T 0 1.7T 0% /sys/firmware 2025-12-04T12:54:13.3972616Z tmpfs 1.7T 0 1.7T 0% /sys/devices/virtual/powercap 2025-12-04T12:54:13.4000366Z Prepare all required actions 2025-12-04T12:54:13.4000568Z Getting action download info 2025-12-04T12:54:13.6107280Z ##[group]Run ./.github/actions/download-td-artifacts 2025-12-04T12:54:13.6107427Z with: 2025-12-04T12:54:13.6107521Z env: 2025-12-04T12:54:13.6107614Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:13.6107748Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:13.6107922Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:13.6108179Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:13.6108679Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:13.6109177Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:13.6109314Z AWS_REGION: us-east-1 2025-12-04T12:54:13.6109476Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:13.6109631Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:13.6111859Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:13.6111967Z ##[endgroup] 2025-12-04T12:54:13.6125376Z ##[group]Run seemethere/download-artifact-s3@v4 2025-12-04T12:54:13.6125569Z with: 2025-12-04T12:54:13.6125666Z name: td_results 2025-12-04T12:54:13.6125778Z s3-bucket: gha-artifacts 2025-12-04T12:54:13.6125903Z region: us-east-1 2025-12-04T12:54:13.6126015Z env: 2025-12-04T12:54:13.6126119Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:13.6126264Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:13.6126453Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:13.6126627Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:13.6127155Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:13.6127664Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:13.6127794Z AWS_REGION: us-east-1 2025-12-04T12:54:13.6127935Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:13.6128096Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:13.6130324Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:13.6130440Z ##[endgroup] 2025-12-04T12:54:13.8399358Z (node:17109) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-12-04T12:54:13.8399879Z 2025-12-04T12:54:13.8400103Z Please migrate your code to use AWS SDK for JavaScript (v3). 
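
[Annotation] Both artifact steps in this log (the build artifacts earlier and the `td_results` download here) delegate to seemethere/download-artifact-s3, which lists every object under the prefix `pytorch/pytorch/<run-id>/<artifact-name>/` in the `gha-artifacts` bucket and downloads each one, hence the "Found 1 objects with prefix ..." lines. The action itself is JavaScript on AWS SDK v2 (per the deprecation notice it prints); the following is only a boto3 stand-in sketching the same list-then-download flow, using the credentials this job exports via `AWS_ACCESS_KEY_ID` and friends.

```python
import os
import boto3  # stand-in for the action's JS AWS SDK v2 client

def download_artifact(bucket: str, prefix: str, dest_dir: str) -> None:
    # List every object under the artifact prefix, then fetch each one,
    # printing progress in the same shape as the action's log lines.
    s3 = boto3.client("s3", region_name="us-east-1")
    paginator = s3.get_paginator("list_objects_v2")
    objects = [obj for page in paginator.paginate(Bucket=bucket, Prefix=prefix)
               for obj in page.get("Contents", [])]
    print(f"Found {len(objects)} objects with prefix {prefix}")
    for i, obj in enumerate(objects, start=1):
        dest = os.path.join(dest_dir, os.path.basename(obj["Key"]))
        print(f"Starting download ({i}/{len(objects)}): {dest}")
        s3.download_file(bucket, obj["Key"], dest)
        print(f"Finished download ({i}/{len(objects)}): {dest}")

download_artifact("gha-artifacts",
                  "pytorch/pytorch/19922849170/td_results/",
                  "/home/runner/_work/pytorch/pytorch")
```
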
2025-12-04T12:54:13.8400741Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-12-04T12:54:13.8401324Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-12-04T12:54:14.1156422Z Found 1 objects with prefix pytorch/pytorch/19922849170/td_results/ 2025-12-04T12:54:14.1156897Z Starting download (1/1): /home/runner/_work/pytorch/pytorch/td_results.json 2025-12-04T12:54:14.5534152Z Finished download (1/1): /home/runner/_work/pytorch/pytorch/td_results.json 2025-12-04T12:54:14.5537542Z Artifact download has finished successfully 2025-12-04T12:54:14.5714840Z ##[group]Run mkdir -p .additional_ci_files 2025-12-04T12:54:14.5715022Z mkdir -p .additional_ci_files 2025-12-04T12:54:14.5715208Z mv td_results.json .additional_ci_files/td_results.json || true 2025-12-04T12:54:14.5720136Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:54:14.5720378Z env: 2025-12-04T12:54:14.5720479Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:14.5720625Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:14.5720809Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:14.5720984Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:14.5721679Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:14.5722176Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:14.5722304Z AWS_REGION: us-east-1 2025-12-04T12:54:14.5722580Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:14.5722875Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:14.5725122Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:14.5725236Z ##[endgroup] 2025-12-04T12:54:14.5781196Z ##[group]Run .github/scripts/parse_ref.py 2025-12-04T12:54:14.5781358Z .github/scripts/parse_ref.py 2025-12-04T12:54:14.5785767Z shell: /usr/bin/bash -e {0} 2025-12-04T12:54:14.5785874Z env: 2025-12-04T12:54:14.5785966Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:14.5786102Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:14.5786278Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:14.5786441Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:14.5786938Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:14.5787441Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:14.5787559Z AWS_REGION: us-east-1 2025-12-04T12:54:14.5787736Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:14.5787887Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:14.5790050Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:14.5790155Z ##[endgroup] 2025-12-04T12:54:14.5889788Z Setting output branch=main 2025-12-04T12:54:14.5954528Z Prepare all required actions 2025-12-04T12:54:14.5954790Z Getting action download info 2025-12-04T12:54:14.7916674Z ##[group]Run ./.github/actions/filter-test-configs 2025-12-04T12:54:14.7916818Z with: 2025-12-04T12:54:14.7917063Z github-token: *** 2025-12-04T12:54:14.7920095Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": 
"linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]} 2025-12-04T12:54:14.7923629Z job-name: linux-jammy-rocm-py3.10 / test (distributed, 3, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T12:54:14.7923848Z env: 2025-12-04T12:54:14.7923953Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:14.7924098Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:14.7924278Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:14.7924452Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:14.7924962Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin 
--cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:14.7925449Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:14.7925630Z AWS_REGION: us-east-1 2025-12-04T12:54:14.7925759Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:14.7925913Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:14.7928093Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:14.7928205Z ##[endgroup] 2025-12-04T12:54:14.7954068Z ##[group]Run nick-fields/retry@v3.0.0 2025-12-04T12:54:14.7954199Z with: 2025-12-04T12:54:14.7954294Z shell: bash 2025-12-04T12:54:14.7954393Z timeout_minutes: 10 2025-12-04T12:54:14.7954495Z max_attempts: 5 2025-12-04T12:54:14.7954599Z retry_wait_seconds: 30 2025-12-04T12:54:14.7954899Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-12-04T12:54:14.7955214Z polling_interval_seconds: 1 2025-12-04T12:54:14.7955331Z warning_on_retry: true 2025-12-04T12:54:14.7955442Z continue_on_error: false 2025-12-04T12:54:14.7955553Z env: 2025-12-04T12:54:14.7955651Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:14.7955789Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:14.7955971Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:14.7956146Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:14.7956656Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:14.7957156Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:14.7957278Z AWS_REGION: us-east-1 2025-12-04T12:54:14.7957414Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:14.7957576Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:14.7959791Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:14.7959949Z GITHUB_TOKEN: *** 2025-12-04T12:54:14.7960055Z ##[endgroup] 2025-12-04T12:54:14.8340573Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-12-04T12:54:14.9748046Z Defaulting to user installation because normal site-packages is not writeable 2025-12-04T12:54:15.0691974Z Collecting requests==2.27.1 2025-12-04T12:54:15.1042914Z Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB) 2025-12-04T12:54:15.1137709Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.1/63.1 KB 6.9 MB/s eta 0:00:00 2025-12-04T12:54:15.1592810Z Collecting pyyaml==6.0.2 2025-12-04T12:54:15.1703817Z Downloading PyYAML-6.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (751 kB) 2025-12-04T12:54:15.1916707Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 751.2/751.2 KB 38.0 MB/s eta 0:00:00 2025-12-04T12:54:15.2896150Z Collecting charset-normalizer~=2.0.0 2025-12-04T12:54:15.2949824Z Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2025-12-04T12:54:15.3140654Z Collecting certifi>=2017.4.17 2025-12-04T12:54:15.3195406Z Downloading certifi-2025.11.12-py3-none-any.whl (159 kB) 2025-12-04T12:54:15.3212934Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 159.4/159.4 KB 213.0 MB/s eta 0:00:00 2025-12-04T12:54:15.3483570Z Collecting urllib3<1.27,>=1.21.1 2025-12-04T12:54:15.3537451Z Downloading urllib3-1.26.20-py2.py3-none-any.whl (144 kB) 2025-12-04T12:54:15.3556848Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 144.2/144.2 KB 162.5 MB/s eta 0:00:00 2025-12-04T12:54:15.3698911Z Collecting idna<4,>=2.5 
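
[Annotation] The nick-fields/retry step above re-runs the pinned install (`requests==2.27.1`, `pyyaml==6.0.2`) up to `max_attempts: 5`, waiting `retry_wait_seconds: 30` between failures; in this run it succeeds on the first attempt. A rough Python equivalent of that wrapper is sketched below; it ignores `timeout_minutes` and the polling interval, and `retry` is a hypothetical helper, not the action's code.

```python
import subprocess
import time

def retry(cmd: str, max_attempts: int = 5, retry_wait_seconds: int = 30) -> None:
    # Re-run a shell command until it exits 0, sleeping between failures;
    # raises once the attempt budget is exhausted.
    for attempt in range(1, max_attempts + 1):
        if subprocess.run(cmd, shell=True).returncode == 0:
            print(f"Command completed after {attempt} attempt(s).")
            return
        if attempt < max_attempts:
            time.sleep(retry_wait_seconds)
    raise RuntimeError(f"command still failing after {max_attempts} attempts: {cmd}")

retry("python3 -m pip install requests==2.27.1 pyyaml==6.0.2")
```
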
2025-12-04T12:54:15.3752911Z Downloading idna-3.11-py3-none-any.whl (71 kB) 2025-12-04T12:54:15.3767830Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 71.0/71.0 KB 108.9 MB/s eta 0:00:00 2025-12-04T12:54:15.4297378Z Installing collected packages: urllib3, pyyaml, idna, charset-normalizer, certifi, requests 2025-12-04T12:54:15.5215311Z WARNING: The script normalizer is installed in '/home/runner/.local/bin' which is not on PATH. 2025-12-04T12:54:15.5215879Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-12-04T12:54:15.5382993Z Successfully installed certifi-2025.11.12 charset-normalizer-2.0.12 idna-3.11 pyyaml-6.0.2 requests-2.27.1 urllib3-1.26.20 2025-12-04T12:54:15.8335131Z Command completed after 1 attempt(s). 2025-12-04T12:54:15.8391348Z ##[group]Run set -x 2025-12-04T12:54:15.8391520Z set -x 2025-12-04T12:54:15.8391659Z  2025-12-04T12:54:15.8391878Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-12-04T12:54:15.8392141Z # in runner workspace 2025-12-04T12:54:15.8392366Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2025-12-04T12:54:15.8396528Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:54:15.8396740Z env: 2025-12-04T12:54:15.8396881Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:15.8397077Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:15.8397331Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:15.8397578Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:15.8398291Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:15.8398986Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:15.8399153Z AWS_REGION: us-east-1 2025-12-04T12:54:15.8399353Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:15.8399570Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:15.8402311Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:15.8402436Z ##[endgroup] 2025-12-04T12:54:15.8427199Z + python3 /home/runner/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2025-12-04T12:54:15.8518531Z Setting output branch=main 2025-12-04T12:54:15.8557859Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2025-12-04T12:54:15.8558138Z echo "Workflow: ${GITHUB_WORKFLOW}" 2025-12-04T12:54:15.8558334Z echo "Job name: ${JOB_NAME}" 2025-12-04T12:54:15.8558504Z  2025-12-04T12:54:15.8558722Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-12-04T12:54:15.8558976Z # in runner workspace 2025-12-04T12:54:15.8559207Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2025-12-04T12:54:15.8559466Z  --workflow "${GITHUB_WORKFLOW}" \ 2025-12-04T12:54:15.8559658Z  --job-name "${JOB_NAME}" \ 2025-12-04T12:54:15.8563474Z  --test-matrix "{"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": 
"unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]}" \ 2025-12-04T12:54:15.8566976Z  --selected-test-configs "" \ 2025-12-04T12:54:15.8567127Z  --pr-number "${PR_NUMBER}" \ 2025-12-04T12:54:15.8567271Z  --tag "${TAG}" \ 2025-12-04T12:54:15.8567410Z  --event-name "${EVENT_NAME}" \ 2025-12-04T12:54:15.8567555Z  --schedule "${SCHEDULE}" \ 2025-12-04T12:54:15.8567695Z  --branch "${HEAD_BRANCH}" 2025-12-04T12:54:15.8572607Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:54:15.8572763Z env: 2025-12-04T12:54:15.8572870Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:15.8573021Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:15.8573207Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:15.8573382Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:15.8573892Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:15.8574446Z 
AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:15.8574575Z AWS_REGION: us-east-1 2025-12-04T12:54:15.8574754Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:15.8574917Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:15.8577115Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:15.8577334Z GITHUB_TOKEN: *** 2025-12-04T12:54:15.8577546Z JOB_NAME: linux-jammy-rocm-py3.10 / test (distributed, 3, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T12:54:15.8577769Z PR_NUMBER: 2025-12-04T12:54:15.8577869Z TAG: 2025-12-04T12:54:15.8577968Z EVENT_NAME: schedule 2025-12-04T12:54:15.8578074Z SCHEDULE: 29 8 * * * 2025-12-04T12:54:15.8578181Z HEAD_BRANCH: main 2025-12-04T12:54:15.8578292Z ##[endgroup] 2025-12-04T12:54:15.8602363Z Workflow: trunk-rocm-mi300 2025-12-04T12:54:15.8602904Z Job name: linux-jammy-rocm-py3.10 / test (distributed, 3, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T12:54:16.4135053Z INFO:root:Issue https://github.com/pytorch/pytorch/issues/167616 created by jithunnair-amd has unstable all the test jobs for trunk-rocm-mi300 / linux-jammy-rocm-py3.10 / test (distributed, 3, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T12:54:16.4351593Z Setting output keep-going=True 2025-12-04T12:54:16.4352036Z Setting output ci-verbose-test-logs=False 2025-12-04T12:54:16.4352437Z Setting output ci-test-showlocals=False 2025-12-04T12:54:16.4353310Z Setting output ci-no-test-timeout=False 2025-12-04T12:54:16.4353665Z Setting output ci-no-td=False 2025-12-04T12:54:16.4354007Z Setting output ci-td-distributed=False 2025-12-04T12:54:16.4354358Z Setting output is-unstable=True 2025-12-04T12:54:16.4354701Z Setting output reenabled-issues= 2025-12-04T12:54:16.4369181Z Setting output test-matrix={"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", 
"shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": 
"rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]} 2025-12-04T12:54:16.4378450Z Setting output is-test-matrix-empty=False 2025-12-04T12:54:16.4449590Z ##[group]Run echo "Filtered matrix:" 2025-12-04T12:54:16.4449861Z echo "Filtered matrix:" 2025-12-04T12:54:16.4458578Z echo "{"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, 
"runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": 
"distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]}" 2025-12-04T12:54:16.4466068Z  2025-12-04T12:54:16.4466160Z echo 2025-12-04T12:54:16.4466278Z echo "Is the current job unstable? True" 2025-12-04T12:54:16.4466413Z  2025-12-04T12:54:16.4466505Z echo 2025-12-04T12:54:16.4466614Z echo "Is keep-going label set? True" 2025-12-04T12:54:16.4466742Z  2025-12-04T12:54:16.4466832Z echo 2025-12-04T12:54:16.4466939Z echo "Reenabled issues? " 2025-12-04T12:54:16.4471675Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:54:16.4471828Z env: 2025-12-04T12:54:16.4471926Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:16.4472067Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:16.4472255Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:16.4472425Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:16.4472942Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:16.4473439Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:16.4473561Z AWS_REGION: us-east-1 2025-12-04T12:54:16.4473743Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:16.4474013Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:16.4476187Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:16.4476301Z ##[endgroup] 2025-12-04T12:54:16.4494509Z Filtered matrix: 2025-12-04T12:54:16.4503520Z {include: [{config: default, shard: 1, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 1, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 1, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 1, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 2, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 2, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 2, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 2, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 3, 
num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 3, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 3, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 3, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 4, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 4, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 4, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 4, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 5, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 5, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 5, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 5, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 6, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 6, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 6, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 6, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, 
unstable: unstable, mem_leak_check: mem_leak_check}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}]} 2025-12-04T12:54:16.4510339Z 2025-12-04T12:54:16.4510388Z Is the current job unstable? True 2025-12-04T12:54:16.4510472Z 2025-12-04T12:54:16.4510519Z Is keep-going label set? True 2025-12-04T12:54:16.4510596Z 2025-12-04T12:54:16.4510639Z Reenabled issues? 2025-12-04T12:54:16.4532694Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-12-04T12:54:16.4532908Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-12-04T12:54:16.4535401Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:54:16.4535553Z env: 2025-12-04T12:54:16.4535652Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:16.4535794Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:16.4535975Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:16.4536148Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:16.4536661Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:16.4537181Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:16.4537305Z AWS_REGION: us-east-1 2025-12-04T12:54:16.4537444Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:16.4537605Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:16.4539761Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:16.4539870Z JOB_TIMEOUT: 600 2025-12-04T12:54:16.4539970Z ##[endgroup] 2025-12-04T12:54:16.4574898Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T12:54:16.4575167Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T12:54:16.4575398Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T12:54:16.4579565Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:54:16.4579757Z env: 2025-12-04T12:54:16.4579879Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:16.4580055Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:16.4580335Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:16.4580676Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:16.4581290Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:16.4581798Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:16.4581924Z 
AWS_REGION: us-east-1 2025-12-04T12:54:16.4582078Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:16.4582242Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:16.4584442Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:16.4584551Z ##[endgroup] 2025-12-04T12:54:16.4653456Z ##[group]Run set -x 2025-12-04T12:54:16.4653599Z set -x 2025-12-04T12:54:16.4653702Z  2025-12-04T12:54:16.4653815Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2025-12-04T12:54:16.4653979Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2025-12-04T12:54:16.4654145Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2025-12-04T12:54:16.4654290Z  TEST_COMMAND=.ci/caffe2/test.sh 2025-12-04T12:54:16.4654411Z else 2025-12-04T12:54:16.4654516Z  TEST_COMMAND=.ci/pytorch/test.sh 2025-12-04T12:54:16.4654634Z fi 2025-12-04T12:54:16.4654722Z  2025-12-04T12:54:16.4654854Z # detached container should get cleaned up by teardown_ec2_linux 2025-12-04T12:54:16.4655058Z # TODO: Stop building test binaries as part of the build phase 2025-12-04T12:54:16.4655236Z # Used for GPU_FLAG since that doesn't play nice 2025-12-04T12:54:16.4655402Z # shellcheck disable=SC2086,SC2090 2025-12-04T12:54:16.4655537Z container_name=$(docker run \ 2025-12-04T12:54:16.4655664Z  ${GPU_FLAG:-} \ 2025-12-04T12:54:16.4655783Z  -e BUILD_ENVIRONMENT \ 2025-12-04T12:54:16.4655905Z  -e PR_NUMBER \ 2025-12-04T12:54:16.4656025Z  -e GITHUB_ACTIONS \ 2025-12-04T12:54:16.4656157Z  -e GITHUB_REPOSITORY \ 2025-12-04T12:54:16.4656281Z  -e GITHUB_WORKFLOW \ 2025-12-04T12:54:16.4656400Z  -e GITHUB_JOB \ 2025-12-04T12:54:16.4656513Z  -e GITHUB_RUN_ID \ 2025-12-04T12:54:16.4656630Z  -e GITHUB_RUN_NUMBER \ 2025-12-04T12:54:16.4656756Z  -e GITHUB_RUN_ATTEMPT \ 2025-12-04T12:54:16.4656929Z  -e JOB_ID \ 2025-12-04T12:54:16.4657074Z  -e JOB_NAME \ 2025-12-04T12:54:16.4657280Z  -e BASE_SHA \ 2025-12-04T12:54:16.4657436Z  -e BRANCH \ 2025-12-04T12:54:16.4657573Z  -e SHA1 \ 2025-12-04T12:54:16.4657738Z  -e AWS_DEFAULT_REGION \ 2025-12-04T12:54:16.4657897Z  -e IN_WHEEL_TEST \ 2025-12-04T12:54:16.4658042Z  -e SHARD_NUMBER \ 2025-12-04T12:54:16.4668672Z  -e TEST_CONFIG \ 2025-12-04T12:54:16.4668816Z  -e NUM_TEST_SHARDS \ 2025-12-04T12:54:16.4668947Z  -e REENABLED_ISSUES \ 2025-12-04T12:54:16.4669077Z  -e CONTINUE_THROUGH_ERROR \ 2025-12-04T12:54:16.4669208Z  -e VERBOSE_TEST_LOGS \ 2025-12-04T12:54:16.4669333Z  -e TEST_SHOWLOCALS \ 2025-12-04T12:54:16.4669452Z  -e NO_TEST_TIMEOUT \ 2025-12-04T12:54:16.4669569Z  -e NO_TD \ 2025-12-04T12:54:16.4669693Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2025-12-04T12:54:16.4669843Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2025-12-04T12:54:16.4669990Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2025-12-04T12:54:16.4670129Z  -e TESTS_TO_INCLUDE \ 2025-12-04T12:54:16.4670309Z  -e HUGGING_FACE_HUB_TOKEN \ 2025-12-04T12:54:16.4670434Z  -e DASHBOARD_TAG \ 2025-12-04T12:54:16.4670584Z  --env-file="${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" \ 2025-12-04T12:54:16.4670750Z  --ulimit stack=10485760:83886080 \ 2025-12-04T12:54:16.4670878Z  --ulimit core=0 \ 2025-12-04T12:54:16.4671099Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2025-12-04T12:54:16.4671256Z  --security-opt seccomp=unconfined \ 2025-12-04T12:54:16.4671393Z  --cap-add=SYS_PTRACE \ 2025-12-04T12:54:16.4671516Z  --shm-size="8g" \ 2025-12-04T12:54:16.4671627Z  --tty \ 2025-12-04T12:54:16.4671730Z  --detach \ 2025-12-04T12:54:16.4671845Z  --name="${container_name}" \ 2025-12-04T12:54:16.4671976Z  --user jenkins \ 2025-12-04T12:54:16.4672118Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2025-12-04T12:54:16.4672279Z  -w 
/var/lib/jenkins/workspace \ 2025-12-04T12:54:16.4672472Z  "${DOCKER_IMAGE}" 2025-12-04T12:54:16.4672584Z ) 2025-12-04T12:54:16.4672691Z # save container name for later step 2025-12-04T12:54:16.4672855Z echo "CONTAINER_NAME=${container_name}" >> "$GITHUB_ENV" 2025-12-04T12:54:16.4673127Z # jenkins user does not have write permission to mounted workspace; work-around by copying within container to jenkins home 2025-12-04T12:54:16.4673480Z docker exec -t "${container_name}" sh -c "cd .. && cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && ${TEST_COMMAND}" 2025-12-04T12:54:16.4676158Z shell: /usr/bin/bash -e {0} 2025-12-04T12:54:16.4676273Z env: 2025-12-04T12:54:16.4676370Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:54:16.4676511Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T12:54:16.4676694Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T12:54:16.4676863Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T12:54:16.4677369Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T12:54:16.4677859Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T12:54:16.4677977Z AWS_REGION: us-east-1 2025-12-04T12:54:16.4678115Z AWS_ACCESS_KEY_ID: *** 2025-12-04T12:54:16.4678273Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T12:54:16.4680492Z AWS_SESSION_TOKEN: *** 2025-12-04T12:54:16.4680618Z BUILD_ENVIRONMENT: linux-jammy-rocm-py3.10 2025-12-04T12:54:16.4680753Z PR_NUMBER: 2025-12-04T12:54:16.4680861Z GITHUB_REPOSITORY: pytorch/pytorch 2025-12-04T12:54:16.4680995Z GITHUB_WORKFLOW: trunk-rocm-mi300 2025-12-04T12:54:16.4681118Z GITHUB_JOB: test 2025-12-04T12:54:16.4681222Z GITHUB_RUN_ID: 19922849170 2025-12-04T12:54:16.4681331Z GITHUB_RUN_NUMBER: 689 2025-12-04T12:54:16.4681444Z GITHUB_RUN_ATTEMPT: 1 2025-12-04T12:54:16.4681549Z JOB_ID: 57116213181 2025-12-04T12:54:16.4681754Z JOB_NAME: linux-jammy-rocm-py3.10 / test (distributed, 3, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T12:54:16.4681969Z BRANCH: main 2025-12-04T12:54:16.4682087Z SHA1: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:54:16.4682242Z BASE_SHA: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:54:16.4682382Z TEST_CONFIG: distributed 2025-12-04T12:54:16.4682490Z SHARD_NUMBER: 3 2025-12-04T12:54:16.4682588Z NUM_TEST_SHARDS: 3 2025-12-04T12:54:16.4682689Z REENABLED_ISSUES: 2025-12-04T12:54:16.4682800Z CONTINUE_THROUGH_ERROR: True 2025-12-04T12:54:16.4682917Z VERBOSE_TEST_LOGS: False 2025-12-04T12:54:16.4683027Z TEST_SHOWLOCALS: False 2025-12-04T12:54:16.4683136Z NO_TEST_TIMEOUT: False 2025-12-04T12:54:16.4683242Z NO_TD: False 2025-12-04T12:54:16.4683510Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:54:16.4683805Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 1 2025-12-04T12:54:16.4683941Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2025-12-04T12:54:16.4684065Z TESTS_TO_INCLUDE: 2025-12-04T12:54:16.4684170Z DASHBOARD_TAG: 2025-12-04T12:54:16.4684384Z HUGGING_FACE_HUB_TOKEN: *** 2025-12-04T12:54:16.4684501Z ##[endgroup] 2025-12-04T12:54:16.4700864Z + [[ distributed == \m\u\l\t\i\g\p\u ]] 2025-12-04T12:54:16.4701239Z + [[ linux-jammy-rocm-py3.10 == *onnx* ]] 2025-12-04T12:54:16.4701521Z + 
TEST_COMMAND=.ci/pytorch/test.sh 2025-12-04T12:54:16.4705820Z +++ nproc --ignore=2 2025-12-04T12:54:16.4716038Z ++ docker run --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e MAX_JOBS=126 -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e TESTS_TO_INCLUDE -e HUGGING_FACE_HUB_TOKEN -e DASHBOARD_TAG --env-file=/home/runner/_work/_temp/github_env_19922849170 --ulimit stack=10485760:83886080 --ulimit core=0 --env-file=/tmp/github_env_19922849170 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --shm-size=8g --tty --detach --name= --user jenkins -v /home/runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:54:16.6675122Z + container_name=e26df583e7f4bb508725d538ee16bf0bb710e66d463888c8c13af7907070676c 2025-12-04T12:54:16.6675529Z + echo CONTAINER_NAME=e26df583e7f4bb508725d538ee16bf0bb710e66d463888c8c13af7907070676c 2025-12-04T12:54:16.6676120Z + docker exec -t e26df583e7f4bb508725d538ee16bf0bb710e66d463888c8c13af7907070676c sh -c 'cd .. 
&& cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && .ci/pytorch/test.sh' 2025-12-04T12:54:19.8979952Z Processing ./dist/torch-2.10.0a0+gitffd9b0f-cp310-cp310-linux_x86_64.whl 2025-12-04T12:54:20.4408586Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (3.18.0) 2025-12-04T12:54:20.4409837Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (4.12.2) 2025-12-04T12:54:20.4413178Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (1.13.3) 2025-12-04T12:54:20.4414253Z Requirement already satisfied: networkx>=2.5.1 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (2.8.8) 2025-12-04T12:54:20.4415292Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (3.1.6) 2025-12-04T12:54:20.4416371Z Requirement already satisfied: fsspec>=0.8.5 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (2025.10.0) 2025-12-04T12:54:20.4583007Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.3->torch==2.10.0a0+gitffd9b0f) (1.3.0) 2025-12-04T12:54:20.4606367Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch==2.10.0a0+gitffd9b0f) (3.0.3) 2025-12-04T12:54:20.6604885Z Installing collected packages: torch 2025-12-04T12:54:26.1578885Z Successfully installed torch-2.10.0a0+gitffd9b0f 2025-12-04T12:54:26.1948154Z + export TERM=vt100 2025-12-04T12:54:26.1948418Z + TERM=vt100 2025-12-04T12:54:26.1952243Z ++ dirname .ci/pytorch/test.sh 2025-12-04T12:54:26.1962305Z + source .ci/pytorch/common.sh 2025-12-04T12:54:26.1965916Z +++ dirname .ci/pytorch/common.sh 2025-12-04T12:54:26.1973653Z ++ source .ci/pytorch/common_utils.sh 2025-12-04T12:54:26.1975390Z +++ declare -f -t trap_add 2025-12-04T12:54:26.1980583Z ++ set -ex -o pipefail 2025-12-04T12:54:26.1980825Z ++ [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T12:54:26.1981067Z ++ unset HIP_PLATFORM 2025-12-04T12:54:26.1981280Z ++ export PYTORCH_TEST_WITH_ROCM=1 2025-12-04T12:54:26.1981509Z ++ PYTORCH_TEST_WITH_ROCM=1 2025-12-04T12:54:26.1981717Z ++ BUILD_TEST_LIBTORCH=0 2025-12-04T12:54:26.1986286Z ++ dirname .ci/pytorch/test.sh 2025-12-04T12:54:26.1995866Z + source .ci/pytorch/common-build.sh 2025-12-04T12:54:26.1997818Z ++ [[ linux-jammy-rocm-py3.10 != *win-* ]] 2025-12-04T12:54:26.2005840Z ++++ dirname .ci/pytorch/common-build.sh 2025-12-04T12:54:26.2014749Z +++ cd .ci/pytorch 2025-12-04T12:54:26.2015110Z +++ pwd -P 2025-12-04T12:54:26.2018366Z ++ script_dir=/var/lib/jenkins/pytorch/.ci/pytorch 2025-12-04T12:54:26.2018849Z ++ [[ linux-jammy-rocm-py3.10 == *-pch* ]] 2025-12-04T12:54:26.2019194Z ++ which sccache 2025-12-04T12:54:26.2030586Z ++ [[ -z '' ]] 2025-12-04T12:54:26.2030796Z ++ unset SCCACHE_BUCKET 2025-12-04T12:54:26.2031024Z ++ unset SCCACHE_REGION 2025-12-04T12:54:26.2031240Z ++ sccache --stop-server 2025-12-04T12:54:26.2052629Z ++ true 2025-12-04T12:54:26.2052840Z ++ rm -f /var/lib/jenkins/sccache_error.log 2025-12-04T12:54:26.2064975Z ++ trap_add sccache_epilogue EXIT 2025-12-04T12:54:26.2065217Z ++ trap_add_cmd=sccache_epilogue 2025-12-04T12:54:26.2065427Z ++ shift 
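The `trap_add sccache_epilogue EXIT` call being traced here registers the sccache cleanup hook without clobbering any EXIT trap that is already installed: it reads the current trap body back with `trap -p` and re-installs it with the new command appended. A minimal single-signal sketch of the idiom (an illustrative simplification, not the verbatim `.ci/pytorch/common_utils.sh` implementation):

  # Sketch only: append a command to an existing trap instead of replacing it.
  trap_add() {
    local cmd=$1 sig=$2
    local existing
    # Recover the body of the currently installed trap for $sig, if any.
    existing=$(trap -p "$sig" | sed -n "s/^trap -- '\(.*\)' $sig\$/\1/p")
    # Re-install a combined trap so earlier handlers still run.
    trap -- "${existing:+$existing; }$cmd" "$sig"
  }
  trap_add sccache_epilogue EXIT   # appends rather than replaces the EXIT handler

The helper in the trace additionally loops over several signal names per call, which is why the command argument is `shift`ed off before the `for trap_add_name in "$@"` loop below.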
2025-12-04T12:54:26.2065597Z ++ for trap_add_name in "$@" 2025-12-04T12:54:26.2072176Z ++++ trap -p EXIT 2025-12-04T12:54:26.2074407Z +++ eval 'extract_trap_cmd ' 2025-12-04T12:54:26.2074596Z ++++ extract_trap_cmd 2025-12-04T12:54:26.2074773Z ++++ printf '%s\n' '' 2025-12-04T12:54:26.2074938Z +++ printf '%s\n' sccache_epilogue 2025-12-04T12:54:26.2077256Z ++ trap -- ' 2025-12-04T12:54:26.2077400Z sccache_epilogue' EXIT 2025-12-04T12:54:26.2077549Z ++ [[ -n '' ]] 2025-12-04T12:54:26.2077706Z ++ [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T12:54:26.2077939Z ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-12-04T12:54:26.2078154Z ++ SCCACHE_IDLE_TIMEOUT=0 2025-12-04T12:54:26.2078316Z ++ sccache --start-server 2025-12-04T12:54:26.2094592Z sccache: Starting the server... 2025-12-04T12:54:26.2284666Z sccache: Listening on address 127.0.0.1:4226 2025-12-04T12:54:26.2295985Z ++ sccache --zero-stats 2025-12-04T12:54:26.2311816Z Statistics zeroed. 2025-12-04T12:54:26.2313355Z ++ which ccache 2025-12-04T12:54:26.2321013Z + [[ linux-jammy-rocm-py3.10 != *rocm* ]] 2025-12-04T12:54:26.2321187Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-12-04T12:54:26.2321339Z + echo 'Environment variables:' 2025-12-04T12:54:26.2321489Z Environment variables: 2025-12-04T12:54:26.2321615Z + env 2025-12-04T12:54:26.2328198Z GITHUB_WORKSPACE=/home/runner/_work/pytorch/pytorch 2025-12-04T12:54:26.2328443Z CONTINUE_THROUGH_ERROR=True 2025-12-04T12:54:26.2328606Z BUILD_ENVIRONMENT=linux-jammy-rocm-py3.10 2025-12-04T12:54:26.2328814Z HOSTNAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-rpncb 2025-12-04T12:54:26.2329111Z GITHUB_PATH=/home/runner/_work/_temp/_runner_file_commands/add_path_5c4ab8ec-f849-4d98-a8e1-431723f66272 2025-12-04T12:54:26.2329356Z GITHUB_ACTION=__run_2 2025-12-04T12:54:26.2329482Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T12:54:26.2329642Z GITHUB_RUN_NUMBER=689 2025-12-04T12:54:26.2329789Z TEST_CONFIG=distributed 2025-12-04T12:54:26.2329951Z RUNNER_NAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-rpncb 2025-12-04T12:54:26.2330134Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-12-04T12:54:26.2330325Z AWS_DEFAULT_REGION=us-east-1 2025-12-04T12:54:26.2330482Z RUNNER_ARTIFACT_DIR=/home/runner/_work/_temp/artifacts 2025-12-04T12:54:26.2330652Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-12-04T12:54:26.2330798Z GITHUB_REF_TYPE=branch 2025-12-04T12:54:26.2330942Z BASE_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:54:26.2331288Z HUGGING_FACE_HUB_TOKEN=*** 2025-12-04T12:54:26.2333979Z *** 2025-12-04T12:54:26.2334406Z GITHUB_REPOSITORY_ID=65600975 2025-12-04T12:54:26.2334547Z GITHUB_ACTIONS=true 2025-12-04T12:54:26.2334688Z SHA1=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:54:26.2334864Z GITHUB_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:54:26.2335116Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/trunk-rocm-mi300.yml@refs/heads/main 2025-12-04T12:54:26.2335345Z UCC_HOME=/usr 2025-12-04T12:54:26.2335468Z RUNNER_ENVIRONMENT=self-hosted 2025-12-04T12:54:26.2335607Z VERBOSE_TEST_LOGS=False 2025-12-04T12:54:26.2335731Z GITHUB_REF=refs/heads/main 2025-12-04T12:54:26.2335860Z RUNNER_OS=Linux 2025-12-04T12:54:26.2335970Z SHARD_NUMBER=3 2025-12-04T12:54:26.2336085Z GITHUB_REF_PROTECTED=true 2025-12-04T12:54:26.2336392Z RUNNER_MANUALLY_TRAP_SIG=1 2025-12-04T12:54:26.2336525Z HOME=/var/lib/jenkins 2025-12-04T12:54:26.2336660Z GITHUB_API_URL=https://api.github.com 2025-12-04T12:54:26.2336819Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-12-04T12:54:26.2336979Z 
RUNNER_DOCS_DIR=/home/runner/_work/_temp/docs 2025-12-04T12:54:26.2337135Z LANG=C.UTF-8 2025-12-04T12:54:26.2337264Z UCX_COMMIT=29831d319e6be55cb8c768ca61de335c934ca39e 2025-12-04T12:54:26.2337479Z PYTORCH_TEST_WITH_ROCM=1 2025-12-04T12:54:26.2337642Z RUNNER_TRACKING_ID=github_b361d030-1dc8-4630-8751-535f40dec7b0 2025-12-04T12:54:26.2337808Z RUNNER_ARCH=X64 2025-12-04T12:54:26.2337926Z RUNNER_TEMP=/home/runner/_work/_temp 2025-12-04T12:54:26.2338064Z NUM_TEST_SHARDS=3 2025-12-04T12:54:26.2338171Z UCX_HOME=/usr 2025-12-04T12:54:26.2338390Z GITHUB_STATE=/home/runner/_work/_temp/_runner_file_commands/save_state_5c4ab8ec-f849-4d98-a8e1-431723f66272 2025-12-04T12:54:26.2338755Z JOB_NAME=linux-jammy-rocm-py3.10 / test (distributed, 3, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T12:54:26.2339009Z MAGMA_HOME=/opt/rocm/magma 2025-12-04T12:54:26.2339235Z GITHUB_ENV=/home/runner/_work/_temp/_runner_file_commands/set_env_5c4ab8ec-f849-4d98-a8e1-431723f66272 2025-12-04T12:54:26.2339514Z GITHUB_EVENT_PATH=/home/runner/_work/_temp/_github_workflow/event.json 2025-12-04T12:54:26.2339700Z GITHUB_EVENT_NAME=schedule 2025-12-04T12:54:26.2339863Z GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT=actions-runner-controller/0.12.1 2025-12-04T12:54:26.2340031Z DASHBOARD_TAG= 2025-12-04T12:54:26.2340131Z GITHUB_RUN_ID=19922849170 2025-12-04T12:54:26.2340403Z GITHUB_STEP_SUMMARY=/home/runner/_work/_temp/_runner_file_commands/step_summary_5c4ab8ec-f849-4d98-a8e1-431723f66272 2025-12-04T12:54:26.2340630Z GITHUB_ACTOR=pytorchmergebot 2025-12-04T12:54:26.2340747Z PR_NUMBER= 2025-12-04T12:54:26.2340843Z GITHUB_RUN_ATTEMPT=1 2025-12-04T12:54:26.2340954Z ANACONDA_PYTHON_VERSION=3.10 2025-12-04T12:54:26.2341090Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-12-04T12:54:26.2341229Z TERM=vt100 2025-12-04T12:54:26.2341324Z INSTALLED_VISION=yes 2025-12-04T12:54:26.2341432Z BRANCH=main 2025-12-04T12:54:26.2341534Z OPENSSL_ROOT_DIR=/opt/openssl 2025-12-04T12:54:26.2341650Z TESTS_TO_INCLUDE= 2025-12-04T12:54:26.2341814Z GITHUB_ACTION_PATH=/home/runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-12-04T12:54:26.2342013Z GITHUB_SERVER_URL=https://github.com 2025-12-04T12:54:26.2342155Z PYTORCH_ROCM_ARCH=gfx90a;gfx942;gfx950;gfx1100 2025-12-04T12:54:26.2342309Z UCC_COMMIT=9f4b242cbbd8b1462cbc732eb29316cdfa124b77 2025-12-04T12:54:26.2342443Z REENABLED_ISSUES= 2025-12-04T12:54:26.2342545Z SHLVL=1 2025-12-04T12:54:26.2342637Z MAX_JOBS=126 2025-12-04T12:54:26.2342771Z RUNNER_TEST_RESULTS_DIR=/home/runner/_work/_temp/test-results 2025-12-04T12:54:26.2342929Z GITHUB_ACTOR_ID=97764156 2025-12-04T12:54:26.2343052Z RUNNER_TOOL_CACHE=/home/runner/_work/_tool 2025-12-04T12:54:26.2343216Z GITHUB_WORKFLOW_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:54:26.2343372Z GITHUB_REF_NAME=main 2025-12-04T12:54:26.2343483Z ROCM_PATH=/opt/rocm 2025-12-04T12:54:26.2343587Z GITHUB_JOB=test 2025-12-04T12:54:26.2343689Z NO_TEST_TIMEOUT=False 2025-12-04T12:54:26.2343805Z GITHUB_REPOSITORY=pytorch/pytorch 2025-12-04T12:54:26.2343928Z LC_ALL=C.UTF-8 2025-12-04T12:54:26.2344030Z GITHUB_RETENTION_DAYS=90 2025-12-04T12:54:26.2344195Z RUNNER_WORKSPACE=/home/runner/_work/pytorch 2025-12-04T12:54:26.2344329Z OPENSSL_DIR=/opt/openssl 2025-12-04T12:54:26.2344445Z GITHUB_ACTION_REPOSITORY= 2025-12-04T12:54:26.2344807Z 
PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T12:54:26.2345163Z GITHUB_BASE_REF= 2025-12-04T12:54:26.2345264Z CI=true 2025-12-04T12:54:26.2345361Z GITHUB_REPOSITORY_OWNER=pytorch 2025-12-04T12:54:26.2345473Z JOB_ID=57116213181 2025-12-04T12:54:26.2345564Z GITHUB_HEAD_REF= 2025-12-04T12:54:26.2345657Z GITHUB_ACTION_REF= 2025-12-04T12:54:26.2345789Z TEST_SHOWLOCALS=False 2025-12-04T12:54:26.2345899Z GITHUB_WORKFLOW=trunk-rocm-mi300 2025-12-04T12:54:26.2346020Z DEBIAN_FRONTEND=noninteractive 2025-12-04T12:54:26.2346226Z GITHUB_OUTPUT=/home/runner/_work/_temp/_runner_file_commands/set_output_5c4ab8ec-f849-4d98-a8e1-431723f66272 2025-12-04T12:54:26.2346437Z NO_TD=False 2025-12-04T12:54:26.2346528Z OLDPWD=/var/lib/jenkins 2025-12-04T12:54:26.2346631Z _=/usr/bin/env 2025-12-04T12:54:26.2346758Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2025-12-04T12:54:26.2396605Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2025-12-04T12:54:26.2396845Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-12-04T12:54:26.2397059Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib 2025-12-04T12:54:26.2397278Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/test 2025-12-04T12:54:26.2397470Z + BUILD_DIR=build 2025-12-04T12:54:26.2397585Z + BUILD_RENAMED_DIR=build_renamed 2025-12-04T12:54:26.2397710Z + BUILD_BIN_DIR=build/bin 2025-12-04T12:54:26.2397822Z + SHARD_NUMBER=3 2025-12-04T12:54:26.2397922Z + NUM_TEST_SHARDS=3 2025-12-04T12:54:26.2398767Z + export TORCH_SERIALIZATION_DEBUG=1 2025-12-04T12:54:26.2398922Z + TORCH_SERIALIZATION_DEBUG=1 2025-12-04T12:54:26.2399039Z + export VALGRIND=ON 2025-12-04T12:54:26.2399144Z + VALGRIND=ON 2025-12-04T12:54:26.2399257Z + [[ linux-jammy-rocm-py3.10 == *clang9* ]] 2025-12-04T12:54:26.2399399Z + [[ linux-jammy-rocm-py3.10 == *xpu* ]] 2025-12-04T12:54:26.2399526Z + detect_cuda_arch 2025-12-04T12:54:26.2399634Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-12-04T12:54:26.2399771Z + [[ linux-jammy-rocm-py3.10 == *s390x* ]] 2025-12-04T12:54:26.2399896Z + [[ 0 == \1 ]] 2025-12-04T12:54:26.2399994Z + [[ True == \1 ]] 2025-12-04T12:54:26.2400103Z + [[ linux-jammy-rocm-py3.10 != *bazel* ]] 2025-12-04T12:54:26.2402300Z ++ realpath build/custom_test_artifacts 2025-12-04T12:54:26.2411951Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/pytorch/build/custom_test_artifacts 2025-12-04T12:54:26.2412294Z + [[ -n '' ]] 2025-12-04T12:54:26.2412413Z + echo 'Environment variables' 2025-12-04T12:54:26.2412537Z Environment variables 2025-12-04T12:54:26.2412647Z + env 2025-12-04T12:54:26.2425718Z GITHUB_WORKSPACE=/home/runner/_work/pytorch/pytorch 2025-12-04T12:54:26.2425899Z CONTINUE_THROUGH_ERROR=True 2025-12-04T12:54:26.2426032Z BUILD_ENVIRONMENT=linux-jammy-rocm-py3.10 2025-12-04T12:54:26.2426203Z HOSTNAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-rpncb 2025-12-04T12:54:26.2426449Z GITHUB_PATH=/home/runner/_work/_temp/_runner_file_commands/add_path_5c4ab8ec-f849-4d98-a8e1-431723f66272 2025-12-04T12:54:26.2426656Z GITHUB_ACTION=__run_2 2025-12-04T12:54:26.2426774Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T12:54:26.2426896Z GITHUB_RUN_NUMBER=689 2025-12-04T12:54:26.2427003Z TEST_CONFIG=distributed 2025-12-04T12:54:26.2427149Z RUNNER_NAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-rpncb 
2025-12-04T12:54:26.2427310Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-12-04T12:54:26.2427437Z AWS_DEFAULT_REGION=us-east-1 2025-12-04T12:54:26.2427578Z RUNNER_ARTIFACT_DIR=/home/runner/_work/_temp/artifacts 2025-12-04T12:54:26.2427729Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-12-04T12:54:26.2427858Z GITHUB_REF_TYPE=branch 2025-12-04T12:54:26.2428032Z BASE_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:54:26.2428212Z HUGGING_FACE_HUB_TOKEN=*** 2025-12-04T12:54:26.2428335Z *** 2025-12-04T12:54:26.2428432Z GITHUB_REPOSITORY_ID=65600975 2025-12-04T12:54:26.2428549Z GITHUB_ACTIONS=true 2025-12-04T12:54:26.2428670Z SHA1=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:54:26.2428823Z GITHUB_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:54:26.2429043Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/trunk-rocm-mi300.yml@refs/heads/main 2025-12-04T12:54:26.2429237Z UCC_HOME=/usr 2025-12-04T12:54:26.2429339Z TORCH_SERIALIZATION_DEBUG=1 2025-12-04T12:54:26.2429454Z RUNNER_ENVIRONMENT=self-hosted 2025-12-04T12:54:26.2429608Z VERBOSE_TEST_LOGS=False 2025-12-04T12:54:26.2429720Z GITHUB_REF=refs/heads/main 2025-12-04T12:54:26.2429829Z RUNNER_OS=Linux 2025-12-04T12:54:26.2429928Z SHARD_NUMBER=3 2025-12-04T12:54:26.2430032Z GITHUB_REF_PROTECTED=true 2025-12-04T12:54:26.2430151Z RUNNER_MANUALLY_TRAP_SIG=1 2025-12-04T12:54:26.2430310Z HOME=/var/lib/jenkins 2025-12-04T12:54:26.2430429Z GITHUB_API_URL=https://api.github.com 2025-12-04T12:54:26.2430566Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-12-04T12:54:26.2430785Z RUNNER_DOCS_DIR=/home/runner/_work/_temp/docs 2025-12-04T12:54:26.2430916Z LANG=C.UTF-8 2025-12-04T12:54:26.2431036Z UCX_COMMIT=29831d319e6be55cb8c768ca61de335c934ca39e 2025-12-04T12:54:26.2431181Z PYTORCH_TEST_WITH_ROCM=1 2025-12-04T12:54:26.2431327Z RUNNER_TRACKING_ID=github_b361d030-1dc8-4630-8751-535f40dec7b0 2025-12-04T12:54:26.2431477Z RUNNER_ARCH=X64 2025-12-04T12:54:26.2431585Z RUNNER_TEMP=/home/runner/_work/_temp 2025-12-04T12:54:26.2431709Z NUM_TEST_SHARDS=3 2025-12-04T12:54:26.2431813Z UCX_HOME=/usr 2025-12-04T12:54:26.2431999Z GITHUB_STATE=/home/runner/_work/_temp/_runner_file_commands/save_state_5c4ab8ec-f849-4d98-a8e1-431723f66272 2025-12-04T12:54:26.2432305Z JOB_NAME=linux-jammy-rocm-py3.10 / test (distributed, 3, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T12:54:26.2432528Z MAGMA_HOME=/opt/rocm/magma 2025-12-04T12:54:26.2432723Z GITHUB_ENV=/home/runner/_work/_temp/_runner_file_commands/set_env_5c4ab8ec-f849-4d98-a8e1-431723f66272 2025-12-04T12:54:26.2432971Z GITHUB_EVENT_PATH=/home/runner/_work/_temp/_github_workflow/event.json 2025-12-04T12:54:26.2433139Z GITHUB_EVENT_NAME=schedule 2025-12-04T12:54:26.2433299Z GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT=actions-runner-controller/0.12.1 2025-12-04T12:54:26.2433467Z DASHBOARD_TAG= 2025-12-04T12:54:26.2433567Z GITHUB_RUN_ID=19922849170 2025-12-04T12:54:26.2433778Z GITHUB_STEP_SUMMARY=/home/runner/_work/_temp/_runner_file_commands/step_summary_5c4ab8ec-f849-4d98-a8e1-431723f66272 2025-12-04T12:54:26.2434014Z GITHUB_ACTOR=pytorchmergebot 2025-12-04T12:54:26.2434128Z PR_NUMBER= 2025-12-04T12:54:26.2434221Z GITHUB_RUN_ATTEMPT=1 2025-12-04T12:54:26.2434324Z VALGRIND=ON 2025-12-04T12:54:26.2434425Z ANACONDA_PYTHON_VERSION=3.10 2025-12-04T12:54:26.2434562Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-12-04T12:54:26.2434702Z TERM=vt100 2025-12-04T12:54:26.2434798Z INSTALLED_VISION=yes 2025-12-04T12:54:26.2434905Z BRANCH=main 2025-12-04T12:54:26.2435010Z 
OPENSSL_ROOT_DIR=/opt/openssl 2025-12-04T12:54:26.2435126Z TESTS_TO_INCLUDE= 2025-12-04T12:54:26.2435289Z GITHUB_ACTION_PATH=/home/runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-12-04T12:54:26.2435480Z GITHUB_SERVER_URL=https://github.com 2025-12-04T12:54:26.2435621Z PYTORCH_ROCM_ARCH=gfx90a;gfx942;gfx950;gfx1100 2025-12-04T12:54:26.2435775Z UCC_COMMIT=9f4b242cbbd8b1462cbc732eb29316cdfa124b77 2025-12-04T12:54:26.2435914Z REENABLED_ISSUES= 2025-12-04T12:54:26.2436014Z SHLVL=1 2025-12-04T12:54:26.2436105Z MAX_JOBS=126 2025-12-04T12:54:26.2436243Z RUNNER_TEST_RESULTS_DIR=/home/runner/_work/_temp/test-results 2025-12-04T12:54:26.2436399Z GITHUB_ACTOR_ID=97764156 2025-12-04T12:54:26.2436518Z RUNNER_TOOL_CACHE=/home/runner/_work/_tool 2025-12-04T12:54:26.2436681Z GITHUB_WORKFLOW_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:54:26.2436835Z GITHUB_REF_NAME=main 2025-12-04T12:54:26.2436998Z ROCM_PATH=/opt/rocm 2025-12-04T12:54:26.2437100Z GITHUB_JOB=test 2025-12-04T12:54:26.2437200Z NO_TEST_TIMEOUT=False 2025-12-04T12:54:26.2437314Z GITHUB_REPOSITORY=pytorch/pytorch 2025-12-04T12:54:26.2437434Z LC_ALL=C.UTF-8 2025-12-04T12:54:26.2437534Z GITHUB_RETENTION_DAYS=90 2025-12-04T12:54:26.2437660Z RUNNER_WORKSPACE=/home/runner/_work/pytorch 2025-12-04T12:54:26.2437793Z OPENSSL_DIR=/opt/openssl 2025-12-04T12:54:26.2437906Z GITHUB_ACTION_REPOSITORY= 2025-12-04T12:54:26.2438294Z PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T12:54:26.2438647Z GITHUB_BASE_REF= 2025-12-04T12:54:26.2438745Z CI=true 2025-12-04T12:54:26.2438845Z GITHUB_REPOSITORY_OWNER=pytorch 2025-12-04T12:54:26.2438965Z JOB_ID=57116213181 2025-12-04T12:54:26.2439065Z GITHUB_HEAD_REF= 2025-12-04T12:54:26.2439165Z GITHUB_ACTION_REF= 2025-12-04T12:54:26.2439274Z TEST_SHOWLOCALS=False 2025-12-04T12:54:26.2439388Z GITHUB_WORKFLOW=trunk-rocm-mi300 2025-12-04T12:54:26.2439516Z DEBIAN_FRONTEND=noninteractive 2025-12-04T12:54:26.2439731Z GITHUB_OUTPUT=/home/runner/_work/_temp/_runner_file_commands/set_output_5c4ab8ec-f849-4d98-a8e1-431723f66272 2025-12-04T12:54:26.2439944Z NO_TD=False 2025-12-04T12:54:26.2440040Z OLDPWD=/var/lib/jenkins 2025-12-04T12:54:26.2440146Z _=/usr/bin/env 2025-12-04T12:54:26.2440338Z + echo 'Testing pytorch' 2025-12-04T12:54:26.2440446Z Testing pytorch 2025-12-04T12:54:26.2440545Z + export LANG=C.UTF-8 2025-12-04T12:54:26.2440648Z + LANG=C.UTF-8 2025-12-04T12:54:26.2440743Z + PR_NUMBER= 2025-12-04T12:54:26.2440854Z + [[ distributed == \d\e\f\a\u\l\t ]] 2025-12-04T12:54:26.2440988Z + [[ distributed == \d\i\s\t\r\i\b\u\t\e\d ]] 2025-12-04T12:54:26.2441126Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T12:54:26.2441262Z + export HIP_VISIBLE_DEVICES=0,1,2,3 2025-12-04T12:54:26.2441388Z + HIP_VISIBLE_DEVICES=0,1,2,3 2025-12-04T12:54:26.2441513Z + [[ distributed == \s\l\o\w ]] 2025-12-04T12:54:26.2441651Z + [[ linux-jammy-rocm-py3.10 == *slow-gradcheck* ]] 2025-12-04T12:54:26.2441799Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-12-04T12:54:26.2441933Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T12:54:26.2442074Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-12-04T12:54:26.2442217Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-12-04T12:54:26.2442347Z + [[ distributed == *crossref* ]] 2025-12-04T12:54:26.2442558Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T12:54:26.2442685Z + export VALGRIND=OFF 
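This is the ROCm branch of `test.sh` (`PYTORCH_TEST_WITH_ROCM=1`): the job pins the four MI300 GPUs with `HIP_VISIBLE_DEVICES=0,1,2,3` and sets `PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda`, because ROCm builds of PyTorch expose AMD GPUs through the `torch.cuda` device namespace. `HIP_VISIBLE_DEVICES` masks devices much like `CUDA_VISIBLE_DEVICES` does on NVIDIA; a quick check along these lines (illustrative only, not part of the CI scripts):

  # Assumes a ROCm build of torch is installed, as done above from dist/*.whl.
  HIP_VISIBLE_DEVICES=0,1 python -c '
  import torch
  print(torch.version.hip)          # HIP/ROCm version string on ROCm builds
  print(torch.cuda.is_available())  # True when at least one GPU is visible
  print(torch.cuda.device_count())  # 2 here: only the unmasked devices show up
  '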
2025-12-04T12:54:26.2442785Z + VALGRIND=OFF
2025-12-04T12:54:26.2442880Z + rocminfo
2025-12-04T12:54:26.2552186Z ROCk module version 6.12.12 is loaded
2025-12-04T12:54:26.3289452Z =====================
2025-12-04T12:54:26.3289871Z HSA System Attributes
2025-12-04T12:54:26.3290246Z =====================
2025-12-04T12:54:26.3290550Z Runtime Version: 1.18
2025-12-04T12:54:26.3290891Z Runtime Ext Version: 1.14
2025-12-04T12:54:26.3291238Z System Timestamp Freq.: 1000.000000MHz
2025-12-04T12:54:26.3291793Z Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
2025-12-04T12:54:26.3292348Z Machine Model: LARGE
2025-12-04T12:54:26.3292802Z System Endianness: LITTLE
2025-12-04T12:54:26.3293194Z Mwaitx: DISABLED
2025-12-04T12:54:26.3293510Z XNACK enabled: NO
2025-12-04T12:54:26.3293818Z DMAbuf Support: YES
2025-12-04T12:54:26.3294136Z VMM Support: YES
2025-12-04T12:54:26.3294344Z
2025-12-04T12:54:26.3294454Z ==========
2025-12-04T12:54:26.3294747Z HSA Agents
2025-12-04T12:54:26.3295024Z ==========
2025-12-04T12:54:26.3295297Z *******
2025-12-04T12:54:26.3295566Z Agent 1
2025-12-04T12:54:26.3295832Z *******
2025-12-04T12:54:26.3296327Z Name: AMD EPYC 9575F 64-Core Processor
2025-12-04T12:54:26.3296743Z Uuid: CPU-XX
2025-12-04T12:54:26.3297183Z Marketing Name: AMD EPYC 9575F 64-Core Processor
2025-12-04T12:54:26.3297632Z Vendor Name: CPU
2025-12-04T12:54:26.3298052Z Feature: None specified
2025-12-04T12:54:26.3298473Z Profile: FULL_PROFILE
2025-12-04T12:54:26.3298905Z Float Round Mode: NEAR
2025-12-04T12:54:26.3299342Z Max Queue Number: 0(0x0)
2025-12-04T12:54:26.3299709Z Queue Min Size: 0(0x0)
2025-12-04T12:54:26.3299870Z Queue Max Size: 0(0x0)
2025-12-04T12:54:26.3300028Z Queue Type: MULTI
2025-12-04T12:54:26.3300219Z Node: 0
2025-12-04T12:54:26.3300380Z Device Type: CPU
2025-12-04T12:54:26.3300523Z Cache Info:
2025-12-04T12:54:26.3300651Z L1: 49152(0xc000) KB
2025-12-04T12:54:26.3300800Z Chip ID: 0(0x0)
2025-12-04T12:54:26.3300954Z ASIC Revision: 0(0x0)
2025-12-04T12:54:26.3301115Z Cacheline Size: 64(0x40)
2025-12-04T12:54:26.3301275Z Max Clock Freq. (MHz): 3300
2025-12-04T12:54:26.3301435Z BDFID: 0
2025-12-04T12:54:26.3301591Z Internal Node ID: 0
2025-12-04T12:54:26.3301757Z Compute Unit: 64
2025-12-04T12:54:26.3301915Z SIMDs per CU: 0
2025-12-04T12:54:26.3302079Z Shader Engines: 0
2025-12-04T12:54:26.3302245Z Shader Arrs. per Eng.: 0
2025-12-04T12:54:26.3302415Z WatchPts on Addr. Ranges:1
2025-12-04T12:54:26.3302570Z Memory Properties:
2025-12-04T12:54:26.3302689Z Features: None
2025-12-04T12:54:26.3302811Z Pool Info:
2025-12-04T12:54:26.3302927Z Pool 1
2025-12-04T12:54:26.3303068Z Segment: GLOBAL; FLAGS: FINE GRAINED
2025-12-04T12:54:26.3303230Z Size: 1584733176(0x5e751bf8) KB
2025-12-04T12:54:26.3303394Z Allocatable: TRUE
2025-12-04T12:54:26.3303561Z Alloc Granule: 4KB
2025-12-04T12:54:26.3303846Z Alloc Recommended Granule:4KB
2025-12-04T12:54:26.3304023Z Alloc Alignment: 4KB
2025-12-04T12:54:26.3304191Z Accessible by all: TRUE
2025-12-04T12:54:26.3304338Z Pool 2
2025-12-04T12:54:26.3304476Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
2025-12-04T12:54:26.3304635Z Size: 1584733176(0x5e751bf8) KB
2025-12-04T12:54:26.3304790Z Allocatable: TRUE
2025-12-04T12:54:26.3304957Z Alloc Granule: 4KB
2025-12-04T12:54:26.3305127Z Alloc Recommended Granule:4KB
2025-12-04T12:54:26.3305299Z Alloc Alignment: 4KB
2025-12-04T12:54:26.3305466Z Accessible by all: TRUE
2025-12-04T12:54:26.3305619Z Pool 3
2025-12-04T12:54:26.3305757Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
2025-12-04T12:54:26.3305946Z Size: 1584733176(0x5e751bf8) KB
2025-12-04T12:54:26.3306102Z Allocatable: TRUE
2025-12-04T12:54:26.3306269Z Alloc Granule: 4KB
2025-12-04T12:54:26.3306439Z Alloc Recommended Granule:4KB
2025-12-04T12:54:26.3306610Z Alloc Alignment: 4KB
2025-12-04T12:54:26.3306776Z Accessible by all: TRUE
2025-12-04T12:54:26.3306925Z Pool 4
2025-12-04T12:54:26.3307095Z Segment: GLOBAL; FLAGS: COARSE GRAINED
2025-12-04T12:54:26.3307251Z Size: 1584733176(0x5e751bf8) KB
2025-12-04T12:54:26.3307406Z Allocatable: TRUE
2025-12-04T12:54:26.3307572Z Alloc Granule: 4KB
2025-12-04T12:54:26.3307746Z Alloc Recommended Granule:4KB
2025-12-04T12:54:26.3307919Z Alloc Alignment: 4KB
2025-12-04T12:54:26.3308087Z Accessible by all: TRUE
2025-12-04T12:54:26.3308236Z ISA Info:
2025-12-04T12:54:26.3308349Z *******
2025-12-04T12:54:26.3308461Z Agent 2
2025-12-04T12:54:26.3308567Z *******
2025-12-04T12:54:26.3308695Z Name: AMD EPYC 9575F 64-Core Processor
2025-12-04T12:54:26.3308849Z Uuid: CPU-XX
2025-12-04T12:54:26.3309011Z Marketing Name: AMD EPYC 9575F 64-Core Processor
2025-12-04T12:54:26.3309179Z Vendor Name: CPU
2025-12-04T12:54:26.3309340Z Feature: None specified
2025-12-04T12:54:26.3309506Z Profile: FULL_PROFILE
2025-12-04T12:54:26.3309668Z Float Round Mode: NEAR
2025-12-04T12:54:26.3309831Z Max Queue Number: 0(0x0)
2025-12-04T12:54:26.3309991Z Queue Min Size: 0(0x0)
2025-12-04T12:54:26.3310148Z Queue Max Size: 0(0x0)
2025-12-04T12:54:26.3310376Z Queue Type: MULTI
2025-12-04T12:54:26.3310525Z Node: 1
2025-12-04T12:54:26.3310674Z Device Type: CPU
2025-12-04T12:54:26.3310818Z Cache Info:
2025-12-04T12:54:26.3310942Z L1: 49152(0xc000) KB
2025-12-04T12:54:26.3311086Z Chip ID: 0(0x0)
2025-12-04T12:54:26.3311240Z ASIC Revision: 0(0x0)
2025-12-04T12:54:26.3311404Z Cacheline Size: 64(0x40)
2025-12-04T12:54:26.3311567Z Max Clock Freq. (MHz): 3300
2025-12-04T12:54:26.3311720Z BDFID: 0
2025-12-04T12:54:26.3311876Z Internal Node ID: 1
2025-12-04T12:54:26.3312038Z Compute Unit: 64
2025-12-04T12:54:26.3312197Z SIMDs per CU: 0
2025-12-04T12:54:26.3312355Z Shader Engines: 0
2025-12-04T12:54:26.3312525Z Shader Arrs. per Eng.: 0
2025-12-04T12:54:26.3312692Z WatchPts on Addr. Ranges:1
2025-12-04T12:54:26.3312842Z Memory Properties:
2025-12-04T12:54:26.3312960Z Features: None
2025-12-04T12:54:26.3313080Z Pool Info:
2025-12-04T12:54:26.3313259Z Pool 1
2025-12-04T12:54:26.3313399Z Segment: GLOBAL; FLAGS: FINE GRAINED
2025-12-04T12:54:26.3313561Z Size: 1585355624(0x5e7e9b68) KB
2025-12-04T12:54:26.3313719Z Allocatable: TRUE
2025-12-04T12:54:26.3313883Z Alloc Granule: 4KB
2025-12-04T12:54:26.3314053Z Alloc Recommended Granule:4KB
2025-12-04T12:54:26.3314223Z Alloc Alignment: 4KB
2025-12-04T12:54:26.3314393Z Accessible by all: TRUE
2025-12-04T12:54:26.3314576Z Pool 2
2025-12-04T12:54:26.3314716Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
2025-12-04T12:54:26.3314875Z Size: 1585355624(0x5e7e9b68) KB
2025-12-04T12:54:26.3315033Z Allocatable: TRUE
2025-12-04T12:54:26.3315204Z Alloc Granule: 4KB
2025-12-04T12:54:26.3315375Z Alloc Recommended Granule:4KB
2025-12-04T12:54:26.3315544Z Alloc Alignment: 4KB
2025-12-04T12:54:26.3315711Z Accessible by all: TRUE
2025-12-04T12:54:26.3315860Z Pool 3
2025-12-04T12:54:26.3316000Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
2025-12-04T12:54:26.3316157Z Size: 1585355624(0x5e7e9b68) KB
2025-12-04T12:54:26.3316318Z Allocatable: TRUE
2025-12-04T12:54:26.3316480Z Alloc Granule: 4KB
2025-12-04T12:54:26.3316649Z Alloc Recommended Granule:4KB
2025-12-04T12:54:26.3316819Z Alloc Alignment: 4KB
2025-12-04T12:54:26.3316988Z Accessible by all: TRUE
2025-12-04T12:54:26.3317128Z Pool 4
2025-12-04T12:54:26.3317263Z Segment: GLOBAL; FLAGS: COARSE GRAINED
2025-12-04T12:54:26.3317422Z Size: 1585355624(0x5e7e9b68) KB
2025-12-04T12:54:26.3317575Z Allocatable: TRUE
2025-12-04T12:54:26.3317737Z Alloc Granule: 4KB
2025-12-04T12:54:26.3317904Z Alloc Recommended Granule:4KB
2025-12-04T12:54:26.3318077Z Alloc Alignment: 4KB
2025-12-04T12:54:26.3318243Z Accessible by all: TRUE
2025-12-04T12:54:26.3318389Z ISA Info:
2025-12-04T12:54:26.3318500Z *******
2025-12-04T12:54:26.3318613Z Agent 3
2025-12-04T12:54:26.3318720Z *******
2025-12-04T12:54:26.3318842Z Name: gfx942
2025-12-04T12:54:26.3318994Z Uuid: GPU-73d95c9754364571
2025-12-04T12:54:26.3319152Z Marketing Name:
2025-12-04T12:54:26.3319313Z Vendor Name: AMD
2025-12-04T12:54:26.3319472Z Feature: KERNEL_DISPATCH
2025-12-04T12:54:26.3319631Z Profile: BASE_PROFILE
2025-12-04T12:54:26.3319794Z Float Round Mode: NEAR
2025-12-04T12:54:26.3319959Z Max Queue Number: 128(0x80)
2025-12-04T12:54:26.3320119Z Queue Min Size: 64(0x40)
2025-12-04T12:54:26.3320330Z Queue Max Size: 131072(0x20000)
2025-12-04T12:54:26.3320544Z Queue Type: MULTI
2025-12-04T12:54:26.3320698Z Node: 2
2025-12-04T12:54:26.3320849Z Device Type: GPU
2025-12-04T12:54:26.3320990Z Cache Info:
2025-12-04T12:54:26.3321116Z L1: 32(0x20) KB
2025-12-04T12:54:26.3321258Z L2: 4096(0x1000) KB
2025-12-04T12:54:26.3321395Z L3: 262144(0x40000) KB
2025-12-04T12:54:26.3321539Z Chip ID: 29861(0x74a5)
2025-12-04T12:54:26.3321730Z ASIC Revision: 1(0x1)
2025-12-04T12:54:26.3321893Z Cacheline Size: 128(0x80)
2025-12-04T12:54:26.3322055Z Max Clock Freq. (MHz): 2100
2025-12-04T12:54:26.3322207Z BDFID: 29952
2025-12-04T12:54:26.3322364Z Internal Node ID: 2
2025-12-04T12:54:26.3322525Z Compute Unit: 304
2025-12-04T12:54:26.3322693Z SIMDs per CU: 4
2025-12-04T12:54:26.3322852Z Shader Engines: 32
2025-12-04T12:54:26.3323017Z Shader Arrs. per Eng.: 1
2025-12-04T12:54:26.3323185Z WatchPts on Addr.
Ranges:4 2025-12-04T12:54:26.3323356Z Coherent Host Access: FALSE 2025-12-04T12:54:26.3323511Z Memory Properties: 2025-12-04T12:54:26.3323641Z Features: KERNEL_DISPATCH 2025-12-04T12:54:26.3323794Z Fast F16 Operation: TRUE 2025-12-04T12:54:26.3323956Z Wavefront Size: 64(0x40) 2025-12-04T12:54:26.3324119Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:54:26.3324275Z Workgroup Max Size per Dimension: 2025-12-04T12:54:26.3324410Z x 1024(0x400) 2025-12-04T12:54:26.3324548Z y 1024(0x400) 2025-12-04T12:54:26.3324681Z z 1024(0x400) 2025-12-04T12:54:26.3324829Z Max Waves Per CU: 32(0x20) 2025-12-04T12:54:26.3324995Z Max Work-item Per CU: 2048(0x800) 2025-12-04T12:54:26.3325158Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:54:26.3325313Z Grid Max Size per Dimension: 2025-12-04T12:54:26.3325439Z x 2147483647(0x7fffffff) 2025-12-04T12:54:26.3325579Z y 65535(0xffff) 2025-12-04T12:54:26.3325717Z z 65535(0xffff) 2025-12-04T12:54:26.3325878Z Max fbarriers/Workgrp: 32 2025-12-04T12:54:26.3326099Z Packet Processor uCode:: 185 2025-12-04T12:54:26.3326267Z SDMA engine uCode:: 24 2025-12-04T12:54:26.3326463Z IOMMU Support:: None 2025-12-04T12:54:26.3326601Z Pool Info: 2025-12-04T12:54:26.3326723Z Pool 1 2025-12-04T12:54:26.3326867Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T12:54:26.3327051Z Size: 268419072(0xfffc000) KB 2025-12-04T12:54:26.3327258Z Allocatable: TRUE 2025-12-04T12:54:26.3327423Z Alloc Granule: 4KB 2025-12-04T12:54:26.3327663Z Alloc Recommended Granule:2048KB 2025-12-04T12:54:26.3327864Z Alloc Alignment: 4KB 2025-12-04T12:54:26.3328192Z Accessible by all: FALSE 2025-12-04T12:54:26.3328363Z Pool 2 2025-12-04T12:54:26.3339865Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T12:54:26.3340038Z Size: 268419072(0xfffc000) KB 2025-12-04T12:54:26.3340252Z Allocatable: TRUE 2025-12-04T12:54:26.3340419Z Alloc Granule: 4KB 2025-12-04T12:54:26.3340594Z Alloc Recommended Granule:2048KB 2025-12-04T12:54:26.3340864Z Alloc Alignment: 4KB 2025-12-04T12:54:26.3341041Z Accessible by all: FALSE 2025-12-04T12:54:26.3341194Z Pool 3 2025-12-04T12:54:26.3341335Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T12:54:26.3341504Z Size: 268419072(0xfffc000) KB 2025-12-04T12:54:26.3341664Z Allocatable: TRUE 2025-12-04T12:54:26.3341831Z Alloc Granule: 4KB 2025-12-04T12:54:26.3342007Z Alloc Recommended Granule:2048KB 2025-12-04T12:54:26.3342186Z Alloc Alignment: 4KB 2025-12-04T12:54:26.3342360Z Accessible by all: FALSE 2025-12-04T12:54:26.3342512Z Pool 4 2025-12-04T12:54:26.3342649Z Segment: GROUP 2025-12-04T12:54:26.3342809Z Size: 64(0x40) KB 2025-12-04T12:54:26.3342967Z Allocatable: FALSE 2025-12-04T12:54:26.3343134Z Alloc Granule: 0KB 2025-12-04T12:54:26.3343312Z Alloc Recommended Granule:0KB 2025-12-04T12:54:26.3343486Z Alloc Alignment: 0KB 2025-12-04T12:54:26.3343657Z Accessible by all: FALSE 2025-12-04T12:54:26.3343808Z ISA Info: 2025-12-04T12:54:26.3343930Z ISA 1 2025-12-04T12:54:26.3344076Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T12:54:26.3344255Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:54:26.3344429Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:54:26.3344606Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3344783Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3344948Z Fast f16: TRUE 2025-12-04T12:54:26.3345115Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:54:26.3345278Z Workgroup Max Size per Dimension: 2025-12-04T12:54:26.3345422Z x 1024(0x400) 2025-12-04T12:54:26.3345569Z y 1024(0x400) 2025-12-04T12:54:26.3345706Z z 1024(0x400) 2025-12-04T12:54:26.3345860Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:54:26.3346015Z Grid Max Size per Dimension: 2025-12-04T12:54:26.3346151Z x 2147483647(0x7fffffff) 2025-12-04T12:54:26.3346300Z y 65535(0xffff) 2025-12-04T12:54:26.3346443Z z 65535(0xffff) 2025-12-04T12:54:26.3346601Z FBarrier Max Size: 32 2025-12-04T12:54:26.3346750Z ISA 2 2025-12-04T12:54:26.3346962Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T12:54:26.3347150Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:54:26.3347325Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:54:26.3347496Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3347672Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3347837Z Fast f16: TRUE 2025-12-04T12:54:26.3348005Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:54:26.3348195Z Workgroup Max Size per Dimension: 2025-12-04T12:54:26.3348336Z x 1024(0x400) 2025-12-04T12:54:26.3348478Z y 1024(0x400) 2025-12-04T12:54:26.3348620Z z 1024(0x400) 2025-12-04T12:54:26.3348776Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:54:26.3348928Z Grid Max Size per Dimension: 2025-12-04T12:54:26.3349061Z x 2147483647(0x7fffffff) 2025-12-04T12:54:26.3349206Z y 65535(0xffff) 2025-12-04T12:54:26.3349350Z z 65535(0xffff) 2025-12-04T12:54:26.3349507Z FBarrier Max Size: 32 2025-12-04T12:54:26.3349656Z ******* 2025-12-04T12:54:26.3349770Z Agent 4 2025-12-04T12:54:26.3349881Z ******* 2025-12-04T12:54:26.3350011Z Name: gfx942 2025-12-04T12:54:26.3350203Z Uuid: GPU-0f8baf68cff7012d 2025-12-04T12:54:26.3350369Z Marketing Name: 2025-12-04T12:54:26.3350537Z Vendor Name: AMD 2025-12-04T12:54:26.3350700Z Feature: KERNEL_DISPATCH 2025-12-04T12:54:26.3350858Z Profile: BASE_PROFILE 2025-12-04T12:54:26.3351024Z Float Round Mode: NEAR 2025-12-04T12:54:26.3351191Z Max Queue Number: 128(0x80) 2025-12-04T12:54:26.3351349Z Queue Min Size: 64(0x40) 2025-12-04T12:54:26.3351511Z Queue Max Size: 131072(0x20000) 2025-12-04T12:54:26.3351673Z Queue Type: MULTI 2025-12-04T12:54:26.3351833Z Node: 3 2025-12-04T12:54:26.3351989Z Device Type: GPU 2025-12-04T12:54:26.3352132Z Cache Info: 2025-12-04T12:54:26.3352262Z L1: 32(0x20) KB 2025-12-04T12:54:26.3352405Z L2: 4096(0x1000) KB 2025-12-04T12:54:26.3352544Z L3: 262144(0x40000) KB 2025-12-04T12:54:26.3352688Z Chip ID: 29861(0x74a5) 2025-12-04T12:54:26.3352845Z ASIC Revision: 1(0x1) 2025-12-04T12:54:26.3353006Z Cacheline Size: 128(0x80) 2025-12-04T12:54:26.3353168Z Max Clock Freq. (MHz): 2100 2025-12-04T12:54:26.3353320Z BDFID: 1280 2025-12-04T12:54:26.3353481Z Internal Node ID: 3 2025-12-04T12:54:26.3353641Z Compute Unit: 304 2025-12-04T12:54:26.3353800Z SIMDs per CU: 4 2025-12-04T12:54:26.3353960Z Shader Engines: 32 2025-12-04T12:54:26.3354171Z Shader Arrs. per Eng.: 1 2025-12-04T12:54:26.3354339Z WatchPts on Addr. 
Ranges:4 2025-12-04T12:54:26.3354502Z Coherent Host Access: FALSE 2025-12-04T12:54:26.3354651Z Memory Properties: 2025-12-04T12:54:26.3354777Z Features: KERNEL_DISPATCH 2025-12-04T12:54:26.3354928Z Fast F16 Operation: TRUE 2025-12-04T12:54:26.3355087Z Wavefront Size: 64(0x40) 2025-12-04T12:54:26.3355250Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:54:26.3355433Z Workgroup Max Size per Dimension: 2025-12-04T12:54:26.3355557Z x 1024(0x400) 2025-12-04T12:54:26.3355691Z y 1024(0x400) 2025-12-04T12:54:26.3355824Z z 1024(0x400) 2025-12-04T12:54:26.3355976Z Max Waves Per CU: 32(0x20) 2025-12-04T12:54:26.3356139Z Max Work-item Per CU: 2048(0x800) 2025-12-04T12:54:26.3356300Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:54:26.3356446Z Grid Max Size per Dimension: 2025-12-04T12:54:26.3356572Z x 2147483647(0x7fffffff) 2025-12-04T12:54:26.3356705Z y 65535(0xffff) 2025-12-04T12:54:26.3356832Z z 65535(0xffff) 2025-12-04T12:54:26.3356981Z Max fbarriers/Workgrp: 32 2025-12-04T12:54:26.3357151Z Packet Processor uCode:: 185 2025-12-04T12:54:26.3357319Z SDMA engine uCode:: 24 2025-12-04T12:54:26.3357482Z IOMMU Support:: None 2025-12-04T12:54:26.3357631Z Pool Info: 2025-12-04T12:54:26.3357743Z Pool 1 2025-12-04T12:54:26.3357881Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T12:54:26.3358043Z Size: 268419072(0xfffc000) KB 2025-12-04T12:54:26.3358201Z Allocatable: TRUE 2025-12-04T12:54:26.3358364Z Alloc Granule: 4KB 2025-12-04T12:54:26.3358533Z Alloc Recommended Granule:2048KB 2025-12-04T12:54:26.3358694Z Alloc Alignment: 4KB 2025-12-04T12:54:26.3358863Z Accessible by all: FALSE 2025-12-04T12:54:26.3359010Z Pool 2 2025-12-04T12:54:26.3359149Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T12:54:26.3359306Z Size: 268419072(0xfffc000) KB 2025-12-04T12:54:26.3359463Z Allocatable: TRUE 2025-12-04T12:54:26.3359626Z Alloc Granule: 4KB 2025-12-04T12:54:26.3359797Z Alloc Recommended Granule:2048KB 2025-12-04T12:54:26.3359967Z Alloc Alignment: 4KB 2025-12-04T12:54:26.3360132Z Accessible by all: FALSE 2025-12-04T12:54:26.3360313Z Pool 3 2025-12-04T12:54:26.3360451Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T12:54:26.3360608Z Size: 268419072(0xfffc000) KB 2025-12-04T12:54:26.3360758Z Allocatable: TRUE 2025-12-04T12:54:26.3360919Z Alloc Granule: 4KB 2025-12-04T12:54:26.3361089Z Alloc Recommended Granule:2048KB 2025-12-04T12:54:26.3361297Z Alloc Alignment: 4KB 2025-12-04T12:54:26.3361465Z Accessible by all: FALSE 2025-12-04T12:54:26.3361610Z Pool 4 2025-12-04T12:54:26.3361741Z Segment: GROUP 2025-12-04T12:54:26.3361890Z Size: 64(0x40) KB 2025-12-04T12:54:26.3362044Z Allocatable: FALSE 2025-12-04T12:54:26.3362207Z Alloc Granule: 0KB 2025-12-04T12:54:26.3362412Z Alloc Recommended Granule:0KB 2025-12-04T12:54:26.3362584Z Alloc Alignment: 0KB 2025-12-04T12:54:26.3362742Z Accessible by all: FALSE 2025-12-04T12:54:26.3362887Z ISA Info: 2025-12-04T12:54:26.3362996Z ISA 1 2025-12-04T12:54:26.3363133Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T12:54:26.3363299Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:54:26.3363462Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:54:26.3363621Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3363786Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3363937Z Fast f16: TRUE 2025-12-04T12:54:26.3364092Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:54:26.3364244Z Workgroup Max Size per Dimension: 2025-12-04T12:54:26.3364373Z x 1024(0x400) 2025-12-04T12:54:26.3364507Z y 1024(0x400) 2025-12-04T12:54:26.3364640Z z 1024(0x400) 2025-12-04T12:54:26.3364791Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:54:26.3364932Z Grid Max Size per Dimension: 2025-12-04T12:54:26.3365053Z x 2147483647(0x7fffffff) 2025-12-04T12:54:26.3365186Z y 65535(0xffff) 2025-12-04T12:54:26.3365320Z z 65535(0xffff) 2025-12-04T12:54:26.3365466Z FBarrier Max Size: 32 2025-12-04T12:54:26.3365603Z ISA 2 2025-12-04T12:54:26.3365752Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T12:54:26.3365931Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:54:26.3366093Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:54:26.3366252Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3366421Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3366577Z Fast f16: TRUE 2025-12-04T12:54:26.3366731Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:54:26.3366877Z Workgroup Max Size per Dimension: 2025-12-04T12:54:26.3367007Z x 1024(0x400) 2025-12-04T12:54:26.3367141Z y 1024(0x400) 2025-12-04T12:54:26.3367271Z z 1024(0x400) 2025-12-04T12:54:26.3367417Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:54:26.3367556Z Grid Max Size per Dimension: 2025-12-04T12:54:26.3367679Z x 2147483647(0x7fffffff) 2025-12-04T12:54:26.3367811Z y 65535(0xffff) 2025-12-04T12:54:26.3367977Z z 65535(0xffff) 2025-12-04T12:54:26.3368123Z FBarrier Max Size: 32 2025-12-04T12:54:26.3368259Z ******* 2025-12-04T12:54:26.3368364Z Agent 5 2025-12-04T12:54:26.3368466Z ******* 2025-12-04T12:54:26.3368584Z Name: gfx942 2025-12-04T12:54:26.3368730Z Uuid: GPU-990a2a287a45ff0c 2025-12-04T12:54:26.3368884Z Marketing Name: 2025-12-04T12:54:26.3369040Z Vendor Name: AMD 2025-12-04T12:54:26.3369222Z Feature: KERNEL_DISPATCH 2025-12-04T12:54:26.3369378Z Profile: BASE_PROFILE 2025-12-04T12:54:26.3369533Z Float Round Mode: NEAR 2025-12-04T12:54:26.3369694Z Max Queue Number: 128(0x80) 2025-12-04T12:54:26.3369848Z Queue Min Size: 64(0x40) 2025-12-04T12:54:26.3370000Z Queue Max Size: 131072(0x20000) 2025-12-04T12:54:26.3370150Z Queue Type: MULTI 2025-12-04T12:54:26.3370327Z Node: 4 2025-12-04T12:54:26.3370470Z Device Type: GPU 2025-12-04T12:54:26.3370604Z Cache Info: 2025-12-04T12:54:26.3370721Z L1: 32(0x20) KB 2025-12-04T12:54:26.3370860Z L2: 4096(0x1000) KB 2025-12-04T12:54:26.3370991Z L3: 262144(0x40000) KB 2025-12-04T12:54:26.3371129Z Chip ID: 29861(0x74a5) 2025-12-04T12:54:26.3371277Z ASIC Revision: 1(0x1) 2025-12-04T12:54:26.3371437Z Cacheline Size: 128(0x80) 2025-12-04T12:54:26.3371590Z Max Clock Freq. (MHz): 2100 2025-12-04T12:54:26.3371737Z BDFID: 25856 2025-12-04T12:54:26.3371885Z Internal Node ID: 4 2025-12-04T12:54:26.3372042Z Compute Unit: 304 2025-12-04T12:54:26.3372192Z SIMDs per CU: 4 2025-12-04T12:54:26.3372346Z Shader Engines: 32 2025-12-04T12:54:26.3372506Z Shader Arrs. per Eng.: 1 2025-12-04T12:54:26.3372670Z WatchPts on Addr. 
Ranges:4 2025-12-04T12:54:26.3372833Z Coherent Host Access: FALSE 2025-12-04T12:54:26.3372978Z Memory Properties: 2025-12-04T12:54:26.3373098Z Features: KERNEL_DISPATCH 2025-12-04T12:54:26.3373242Z Fast F16 Operation: TRUE 2025-12-04T12:54:26.3373401Z Wavefront Size: 64(0x40) 2025-12-04T12:54:26.3373559Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:54:26.3373705Z Workgroup Max Size per Dimension: 2025-12-04T12:54:26.3373832Z x 1024(0x400) 2025-12-04T12:54:26.3373963Z y 1024(0x400) 2025-12-04T12:54:26.3374087Z z 1024(0x400) 2025-12-04T12:54:26.3374232Z Max Waves Per CU: 32(0x20) 2025-12-04T12:54:26.3374388Z Max Work-item Per CU: 2048(0x800) 2025-12-04T12:54:26.3374543Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:54:26.3374683Z Grid Max Size per Dimension: 2025-12-04T12:54:26.3374847Z x 2147483647(0x7fffffff) 2025-12-04T12:54:26.3374978Z y 65535(0xffff) 2025-12-04T12:54:26.3375107Z z 65535(0xffff) 2025-12-04T12:54:26.3375254Z Max fbarriers/Workgrp: 32 2025-12-04T12:54:26.3375420Z Packet Processor uCode:: 185 2025-12-04T12:54:26.3375583Z SDMA engine uCode:: 24 2025-12-04T12:54:26.3375741Z IOMMU Support:: None 2025-12-04T12:54:26.3375874Z Pool Info: 2025-12-04T12:54:26.3376077Z Pool 1 2025-12-04T12:54:26.3376212Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T12:54:26.3376367Z Size: 268419072(0xfffc000) KB 2025-12-04T12:54:26.3376520Z Allocatable: TRUE 2025-12-04T12:54:26.3376683Z Alloc Granule: 4KB 2025-12-04T12:54:26.3376847Z Alloc Recommended Granule:2048KB 2025-12-04T12:54:26.3377011Z Alloc Alignment: 4KB 2025-12-04T12:54:26.3377173Z Accessible by all: FALSE 2025-12-04T12:54:26.3377314Z Pool 2 2025-12-04T12:54:26.3377444Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T12:54:26.3377595Z Size: 268419072(0xfffc000) KB 2025-12-04T12:54:26.3377748Z Allocatable: TRUE 2025-12-04T12:54:26.3377902Z Alloc Granule: 4KB 2025-12-04T12:54:26.3378065Z Alloc Recommended Granule:2048KB 2025-12-04T12:54:26.3378228Z Alloc Alignment: 4KB 2025-12-04T12:54:26.3378391Z Accessible by all: FALSE 2025-12-04T12:54:26.3378531Z Pool 3 2025-12-04T12:54:26.3378660Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T12:54:26.3378809Z Size: 268419072(0xfffc000) KB 2025-12-04T12:54:26.3378957Z Allocatable: TRUE 2025-12-04T12:54:26.3379112Z Alloc Granule: 4KB 2025-12-04T12:54:26.3379274Z Alloc Recommended Granule:2048KB 2025-12-04T12:54:26.3379438Z Alloc Alignment: 4KB 2025-12-04T12:54:26.3379598Z Accessible by all: FALSE 2025-12-04T12:54:26.3379734Z Pool 4 2025-12-04T12:54:26.3379861Z Segment: GROUP 2025-12-04T12:54:26.3380006Z Size: 64(0x40) KB 2025-12-04T12:54:26.3380150Z Allocatable: FALSE 2025-12-04T12:54:26.3380344Z Alloc Granule: 0KB 2025-12-04T12:54:26.3380504Z Alloc Recommended Granule:0KB 2025-12-04T12:54:26.3380665Z Alloc Alignment: 0KB 2025-12-04T12:54:26.3380822Z Accessible by all: FALSE 2025-12-04T12:54:26.3380960Z ISA Info: 2025-12-04T12:54:26.3381064Z ISA 1 2025-12-04T12:54:26.3381198Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T12:54:26.3381362Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:54:26.3381526Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:54:26.3381687Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3381889Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3382042Z Fast f16: TRUE 2025-12-04T12:54:26.3382194Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:54:26.3382339Z Workgroup Max Size per Dimension: 2025-12-04T12:54:26.3382469Z x 1024(0x400) 2025-12-04T12:54:26.3382602Z y 1024(0x400) 2025-12-04T12:54:26.3382731Z z 1024(0x400) 2025-12-04T12:54:26.3382912Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:54:26.3383053Z Grid Max Size per Dimension: 2025-12-04T12:54:26.3383174Z x 2147483647(0x7fffffff) 2025-12-04T12:54:26.3383305Z y 65535(0xffff) 2025-12-04T12:54:26.3383440Z z 65535(0xffff) 2025-12-04T12:54:26.3383585Z FBarrier Max Size: 32 2025-12-04T12:54:26.3383722Z ISA 2 2025-12-04T12:54:26.3383861Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T12:54:26.3384038Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:54:26.3384197Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:54:26.3384357Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3384528Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3384680Z Fast f16: TRUE 2025-12-04T12:54:26.3384833Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:54:26.3384979Z Workgroup Max Size per Dimension: 2025-12-04T12:54:26.3385110Z x 1024(0x400) 2025-12-04T12:54:26.3385241Z y 1024(0x400) 2025-12-04T12:54:26.3385372Z z 1024(0x400) 2025-12-04T12:54:26.3385511Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:54:26.3385647Z Grid Max Size per Dimension: 2025-12-04T12:54:26.3385767Z x 2147483647(0x7fffffff) 2025-12-04T12:54:26.3385899Z y 65535(0xffff) 2025-12-04T12:54:26.3386032Z z 65535(0xffff) 2025-12-04T12:54:26.3386176Z FBarrier Max Size: 32 2025-12-04T12:54:26.3386312Z ******* 2025-12-04T12:54:26.3386415Z Agent 6 2025-12-04T12:54:26.3386512Z ******* 2025-12-04T12:54:26.3386628Z Name: gfx942 2025-12-04T12:54:26.3386774Z Uuid: GPU-99c7863ef295feac 2025-12-04T12:54:26.3386927Z Marketing Name: 2025-12-04T12:54:26.3387081Z Vendor Name: AMD 2025-12-04T12:54:26.3387233Z Feature: KERNEL_DISPATCH 2025-12-04T12:54:26.3387386Z Profile: BASE_PROFILE 2025-12-04T12:54:26.3387538Z Float Round Mode: NEAR 2025-12-04T12:54:26.3387693Z Max Queue Number: 128(0x80) 2025-12-04T12:54:26.3387851Z Queue Min Size: 64(0x40) 2025-12-04T12:54:26.3388020Z Queue Max Size: 131072(0x20000) 2025-12-04T12:54:26.3388196Z Queue Type: MULTI 2025-12-04T12:54:26.3388373Z Node: 5 2025-12-04T12:54:26.3388522Z Device Type: GPU 2025-12-04T12:54:26.3388662Z Cache Info: 2025-12-04T12:54:26.3388783Z L1: 32(0x20) KB 2025-12-04T12:54:26.3388917Z L2: 4096(0x1000) KB 2025-12-04T12:54:26.3389055Z L3: 262144(0x40000) KB 2025-12-04T12:54:26.3389195Z Chip ID: 29861(0x74a5) 2025-12-04T12:54:26.3389347Z ASIC Revision: 1(0x1) 2025-12-04T12:54:26.3389529Z Cacheline Size: 128(0x80) 2025-12-04T12:54:26.3389690Z Max Clock Freq. (MHz): 2100 2025-12-04T12:54:26.3389840Z BDFID: 5376 2025-12-04T12:54:26.3389995Z Internal Node ID: 5 2025-12-04T12:54:26.3390156Z Compute Unit: 304 2025-12-04T12:54:26.3390358Z SIMDs per CU: 4 2025-12-04T12:54:26.3390515Z Shader Engines: 32 2025-12-04T12:54:26.3390673Z Shader Arrs. per Eng.: 1 2025-12-04T12:54:26.3390837Z WatchPts on Addr. 
Ranges:4 2025-12-04T12:54:26.3391002Z Coherent Host Access: FALSE 2025-12-04T12:54:26.3391149Z Memory Properties: 2025-12-04T12:54:26.3391271Z Features: KERNEL_DISPATCH 2025-12-04T12:54:26.3391422Z Fast F16 Operation: TRUE 2025-12-04T12:54:26.3391584Z Wavefront Size: 64(0x40) 2025-12-04T12:54:26.3391743Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:54:26.3391896Z Workgroup Max Size per Dimension: 2025-12-04T12:54:26.3392023Z x 1024(0x400) 2025-12-04T12:54:26.3392155Z y 1024(0x400) 2025-12-04T12:54:26.3392284Z z 1024(0x400) 2025-12-04T12:54:26.3392426Z Max Waves Per CU: 32(0x20) 2025-12-04T12:54:26.3392584Z Max Work-item Per CU: 2048(0x800) 2025-12-04T12:54:26.3392741Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:54:26.3392883Z Grid Max Size per Dimension: 2025-12-04T12:54:26.3393003Z x 2147483647(0x7fffffff) 2025-12-04T12:54:26.3393134Z y 65535(0xffff) 2025-12-04T12:54:26.3393265Z z 65535(0xffff) 2025-12-04T12:54:26.3393413Z Max fbarriers/Workgrp: 32 2025-12-04T12:54:26.3393581Z Packet Processor uCode:: 185 2025-12-04T12:54:26.3393746Z SDMA engine uCode:: 24 2025-12-04T12:54:26.3393907Z IOMMU Support:: None 2025-12-04T12:54:26.3394046Z Pool Info: 2025-12-04T12:54:26.3394155Z Pool 1 2025-12-04T12:54:26.3394287Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T12:54:26.3394442Z Size: 268419072(0xfffc000) KB 2025-12-04T12:54:26.3394594Z Allocatable: TRUE 2025-12-04T12:54:26.3394756Z Alloc Granule: 4KB 2025-12-04T12:54:26.3394921Z Alloc Recommended Granule:2048KB 2025-12-04T12:54:26.3395088Z Alloc Alignment: 4KB 2025-12-04T12:54:26.3395250Z Accessible by all: FALSE 2025-12-04T12:54:26.3395440Z Pool 2 2025-12-04T12:54:26.3395575Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T12:54:26.3395727Z Size: 268419072(0xfffc000) KB 2025-12-04T12:54:26.3395882Z Allocatable: TRUE 2025-12-04T12:54:26.3396042Z Alloc Granule: 4KB 2025-12-04T12:54:26.3396212Z Alloc Recommended Granule:2048KB 2025-12-04T12:54:26.3396379Z Alloc Alignment: 4KB 2025-12-04T12:54:26.3396586Z Accessible by all: FALSE 2025-12-04T12:54:26.3396732Z Pool 3 2025-12-04T12:54:26.3396864Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T12:54:26.3397016Z Size: 268419072(0xfffc000) KB 2025-12-04T12:54:26.3397169Z Allocatable: TRUE 2025-12-04T12:54:26.3397327Z Alloc Granule: 4KB 2025-12-04T12:54:26.3397493Z Alloc Recommended Granule:2048KB 2025-12-04T12:54:26.3397656Z Alloc Alignment: 4KB 2025-12-04T12:54:26.3397816Z Accessible by all: FALSE 2025-12-04T12:54:26.3397953Z Pool 4 2025-12-04T12:54:26.3398078Z Segment: GROUP 2025-12-04T12:54:26.3398222Z Size: 64(0x40) KB 2025-12-04T12:54:26.3398378Z Allocatable: FALSE 2025-12-04T12:54:26.3398535Z Alloc Granule: 0KB 2025-12-04T12:54:26.3398700Z Alloc Recommended Granule:0KB 2025-12-04T12:54:26.3398868Z Alloc Alignment: 0KB 2025-12-04T12:54:26.3399030Z Accessible by all: FALSE 2025-12-04T12:54:26.3399172Z ISA Info: 2025-12-04T12:54:26.3399279Z ISA 1 2025-12-04T12:54:26.3399412Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T12:54:26.3399581Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:54:26.3399744Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:54:26.3399907Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3400078Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3400272Z Fast f16: TRUE 2025-12-04T12:54:26.3400430Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:54:26.3400577Z Workgroup Max Size per Dimension: 2025-12-04T12:54:26.3400710Z x 1024(0x400) 2025-12-04T12:54:26.3400847Z y 1024(0x400) 2025-12-04T12:54:26.3400981Z z 1024(0x400) 2025-12-04T12:54:26.3401128Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:54:26.3401273Z Grid Max Size per Dimension: 2025-12-04T12:54:26.3401396Z x 2147483647(0x7fffffff) 2025-12-04T12:54:26.3401532Z y 65535(0xffff) 2025-12-04T12:54:26.3401667Z z 65535(0xffff) 2025-12-04T12:54:26.3401816Z FBarrier Max Size: 32 2025-12-04T12:54:26.3401952Z ISA 2 2025-12-04T12:54:26.3402092Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T12:54:26.3402309Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T12:54:26.3402472Z Profiles: HSA_PROFILE_BASE 2025-12-04T12:54:26.3402636Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3402802Z Default Rounding Mode: NEAR 2025-12-04T12:54:26.3402956Z Fast f16: TRUE 2025-12-04T12:54:26.3403112Z Workgroup Max Size: 1024(0x400) 2025-12-04T12:54:26.3403258Z Workgroup Max Size per Dimension: 2025-12-04T12:54:26.3403421Z x 1024(0x400) 2025-12-04T12:54:26.3403551Z y 1024(0x400) 2025-12-04T12:54:26.3403683Z z 1024(0x400) 2025-12-04T12:54:26.3403830Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T12:54:26.3403979Z Grid Max Size per Dimension: 2025-12-04T12:54:26.3404103Z x 2147483647(0x7fffffff) 2025-12-04T12:54:26.3404238Z y 65535(0xffff) 2025-12-04T12:54:26.3404373Z z 65535(0xffff) 2025-12-04T12:54:26.3404520Z FBarrier Max Size: 32 2025-12-04T12:54:26.3404660Z *** Done *** 2025-12-04T12:54:26.3404771Z + rocminfo 2025-12-04T12:54:26.3404876Z + grep -E 'Name:.*\sgfx|Marketing' 2025-12-04T12:54:26.4228599Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T12:54:26.4229165Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T12:54:26.4229613Z Name: gfx942 2025-12-04T12:54:26.4230023Z Marketing Name: 2025-12-04T12:54:26.4230485Z Name: gfx942 2025-12-04T12:54:26.4230856Z Marketing Name: 2025-12-04T12:54:26.4231256Z Name: gfx942 2025-12-04T12:54:26.4231648Z Marketing Name: 2025-12-04T12:54:26.4232037Z Name: gfx942 2025-12-04T12:54:26.4232431Z Marketing Name: 2025-12-04T12:54:26.4317990Z + MAYBE_ROCM=rocm/ 2025-12-04T12:54:26.4318139Z + [[ linux-jammy-rocm-py3.10 == *xpu* ]] 2025-12-04T12:54:26.4318294Z + [[ linux-jammy-rocm-py3.10 != *-bazel-* ]] 2025-12-04T12:54:26.4318433Z + pip_install ninja==1.10.2 2025-12-04T12:54:26.4318590Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-12-04T12:54:26.4318771Z + python3 -m pip install --progress-bar off ninja==1.10.2 2025-12-04T12:54:26.6238883Z Collecting ninja==1.10.2 2025-12-04T12:54:26.6491989Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2025-12-04T12:54:26.6571453Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2025-12-04T12:54:26.8216159Z Installing collected packages: ninja 2025-12-04T12:54:26.8216592Z Attempting uninstall: ninja 2025-12-04T12:54:26.8219931Z Found existing installation: ninja 1.11.1.4 2025-12-04T12:54:26.8228622Z Uninstalling ninja-1.11.1.4: 2025-12-04T12:54:26.8258006Z Successfully uninstalled ninja-1.11.1.4 2025-12-04T12:54:26.8365474Z Successfully installed ninja-1.10.2 2025-12-04T12:54:26.8699520Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T12:54:26.8700460Z + 
PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T12:54:26.8701123Z + [[ linux-jammy-rocm-py3.10 == *aarch64* ]] 2025-12-04T12:54:26.8701302Z + [[ linux-jammy-rocm-py3.10 == *asan* ]] 2025-12-04T12:54:26.8701477Z + [[ linux-jammy-rocm-py3.10 == *-debug* ]] 2025-12-04T12:54:26.8701638Z + [[ linux-jammy-rocm-py3.10 != *-bazel-* ]] 2025-12-04T12:54:26.8701864Z + echo 'We are not in debug mode: linux-jammy-rocm-py3.10. Expect the assertion to pass' 2025-12-04T12:54:26.8702142Z We are not in debug mode: linux-jammy-rocm-py3.10. Expect the assertion to pass 2025-12-04T12:54:26.8704976Z + cd test 2025-12-04T12:54:26.8705177Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2025-12-04T12:54:27.7125567Z + [[ distributed == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2025-12-04T12:54:27.7125901Z + [[ distributed == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2025-12-04T12:54:27.7126201Z + [[ distributed == \l\e\g\a\c\y\_\n\v\i\d\i\a\_\d\r\i\v\e\r ]] 2025-12-04T12:54:27.7129329Z + DYNAMO_BENCHMARK_FLAGS=() 2025-12-04T12:54:27.7129532Z + [[ distributed == *pr_time_benchmarks* ]] 2025-12-04T12:54:27.7129749Z + [[ distributed == *dynamo_eager* ]] 2025-12-04T12:54:27.7129988Z + [[ distributed == *aot_eager* ]] 2025-12-04T12:54:27.7130346Z + [[ distributed == *aot_inductor* ]] 2025-12-04T12:54:27.7130559Z + [[ distributed == *max_autotune_inductor* ]] 2025-12-04T12:54:27.7130770Z + [[ distributed == *inductor* ]] 2025-12-04T12:54:27.7130968Z + [[ distributed == *dynamic* ]] 2025-12-04T12:54:27.7131166Z + [[ distributed == *cpu* ]] 2025-12-04T12:54:27.7131351Z + [[ distributed == *xpu* ]] 2025-12-04T12:54:27.7131583Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2025-12-04T12:54:27.7144387Z + [[ linux-jammy-rocm-py3.10 == *libtorch* ]] 2025-12-04T12:54:27.7144593Z + [[ linux-jammy-rocm-py3.10 == *-bazel-* ]] 2025-12-04T12:54:27.7148369Z + cd test 2025-12-04T12:54:27.7148865Z + python -c 'import torch; print(torch.__config__.show())' 2025-12-04T12:54:28.4397151Z PyTorch built with: 2025-12-04T12:54:28.4397410Z - GCC 11.4 2025-12-04T12:54:28.4397581Z - C++ Version: 201703 2025-12-04T12:54:28.4397972Z - Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-12-04T12:54:28.4398485Z - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-12-04T12:54:28.4398778Z - OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-12-04T12:54:28.4399013Z - LAPACK is enabled (usually provided by MKL) 2025-12-04T12:54:28.4399230Z - NNPACK is enabled 2025-12-04T12:54:28.4399418Z - CPU capability usage: AVX512 2025-12-04T12:54:28.4399650Z - HIP Runtime 7.1.25424 2025-12-04T12:54:28.4399827Z - MIOpen 3.5.1 2025-12-04T12:54:28.4399986Z - Magma 2.9.0 2025-12-04T12:54:28.4402799Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, COMMIT_SHA=35b7a9a26c5923d98aebaa41a031dae21788a9ee, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_FBGEMM_GENAI -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.10.0, USE_CUDA=OFF, USE_CUDNN=OFF, USE_CUSPARSELT=OFF, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=ON, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, 2025-12-04T12:54:28.4405576Z 2025-12-04T12:54:28.6569365Z + cd test 2025-12-04T12:54:28.6569711Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2025-12-04T12:54:29.2931992Z ATen/Parallel: 2025-12-04T12:54:29.2932426Z at::get_num_threads() : 128 2025-12-04T12:54:29.2932777Z at::get_num_interop_threads() : 128 2025-12-04T12:54:29.2933138Z OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-12-04T12:54:29.2933462Z omp_get_max_threads() : 128 2025-12-04T12:54:29.2934067Z Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-12-04T12:54:29.2934658Z mkl_get_max_threads() : 128 2025-12-04T12:54:29.2935074Z Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-12-04T12:54:29.2936246Z std::thread::hardware_concurrency() : 128 2025-12-04T12:54:29.2936580Z Environment variables: 2025-12-04T12:54:29.2936860Z OMP_NUM_THREADS : [not set] 2025-12-04T12:54:29.2937146Z MKL_NUM_THREADS : [not set] 2025-12-04T12:54:29.2937449Z ATen parallel backend: OpenMP 2025-12-04T12:54:29.2937664Z 2025-12-04T12:54:29.5273963Z + [[ distributed == *numpy_2* ]] 2025-12-04T12:54:29.5274300Z + [[ linux-jammy-rocm-py3.10 == *aarch64* ]] 2025-12-04T12:54:29.5274565Z + [[ distributed == *backward* ]] 2025-12-04T12:54:29.5274824Z + [[ distributed == *libtorch_agnostic_targetting* ]] 2025-12-04T12:54:29.5275082Z + [[ distributed == *xla* ]] 2025-12-04T12:54:29.5275287Z + [[ distributed == *vllm* ]] 2025-12-04T12:54:29.5275494Z + [[ distributed == *executorch* ]] 2025-12-04T12:54:29.5275721Z + [[ distributed == \j\i\t\_\l\e\g\a\c\y ]] 2025-12-04T12:54:29.5275964Z + [[ distributed == \q\u\a\n\t\i\z\a\t\i\o\n ]] 2025-12-04T12:54:29.5276205Z + [[ linux-jammy-rocm-py3.10 == *libtorch* ]] 2025-12-04T12:54:29.5276473Z + [[ distributed == distributed ]] 2025-12-04T12:54:29.5276684Z + test_distributed 2025-12-04T12:54:29.5276879Z + echo 'Testing distributed python tests' 2025-12-04T12:54:29.5277109Z Testing distributed python tests 2025-12-04T12:54:29.5277405Z + python test/run_test.py --distributed-tests --shard 3 3 --verbose 2025-12-04T12:54:31.2326296Z Excluding distributed/rpc/test_faulty_agent on ROCm 2025-12-04T12:54:31.2326710Z Excluding distributed/rpc/test_tensorpipe_agent on ROCm 2025-12-04T12:54:31.2327047Z Excluding distributed/rpc/test_share_memory on ROCm 2025-12-04T12:54:31.2327390Z Excluding distributed/rpc/cuda/test_tensorpipe_agent on ROCm 2025-12-04T12:54:32.2792931Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json 2025-12-04T12:54:32.6266779Z Ignoring disabled issues: [''] 2025-12-04T12:54:32.6315928Z Found test times from artifacts 2025-12-04T12:54:32.6479428Z Found test times from artifacts 2025-12-04T12:54:32.6482590Z Running all tests 2025-12-04T12:54:32.6553817Z Running parallel tests on 1 processes
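This shard was selected with --shard 3 3 (shard 3 of 3), and the listing that follows shows how run_test.py uses the recorded per-file durations ("Found test times from artifacts") to balance work across shards; a single large file can itself be split further, which is what fractions like "distributed/tensor/test_dtensor_compile 3/4" mean. A simplified sketch of time-balanced sharding in the same spirit (a greedy longest-processing-time assignment; this is not run_test.py's actual code):

    import heapq

    def shard_tests(tests, num_shards):
        # tests: iterable of (name, estimated_minutes).
        # Greedily place each test, largest first, onto the currently lightest shard.
        shards = [(0.0, i, []) for i in range(num_shards)]  # (total_minutes, shard_id, names)
        heapq.heapify(shards)
        for name, minutes in sorted(tests, key=lambda t: -t[1]):
            total, idx, names = heapq.heappop(shards)
            names.append(name)
            heapq.heappush(shards, (total + minutes, idx, names))
        return sorted(shards, key=lambda s: s[1])

    # e.g. shard_tests([("test_dtensor_compile", 5.3), ("test_nested_dict", 0.04)], num_shards=3)

2025-12-04T12:54:32.6556155Z Name: tests to run (est. 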
time: 169.99min) 2025-12-04T12:54:32.6556688Z Serial tests (114): 2025-12-04T12:54:32.6557033Z distributed/test_dynamo_distributed 1/1 2025-12-04T12:54:32.6557374Z distributed/tensor/test_op_schema 1/1 2025-12-04T12:54:32.6557693Z distributed/checkpoint/test_nested_dict 1/1 2025-12-04T12:54:32.6558065Z distributed/checkpoint/test_consolidate_hf_safetensors 1/1 2025-12-04T12:54:32.6558441Z distributed/tensor/test_dtensor_compile 3/4 2025-12-04T12:54:32.6558795Z distributed/checkpoint/_experimental/test_barriers 1/1 2025-12-04T12:54:32.6559134Z distributed/pipelining/test_transformer 1/1 2025-12-04T12:54:32.6559458Z distributed/flight_recorder/test_fr_analysis 1/1 2025-12-04T12:54:32.6559771Z distributed/_composable/test_contract 1/1 2025-12-04T12:54:32.6560072Z distributed/checkpoint/test_dedup_tensors 1/1 2025-12-04T12:54:32.6560441Z distributed/test_c10d_functional_native 1/1 2025-12-04T12:54:32.6560735Z distributed/pipelining/test_backward 1/1 2025-12-04T12:54:32.6561020Z distributed/test_nvshmem_triton 1/1 2025-12-04T12:54:32.6561299Z distributed/tensor/test_dtensor 1/3 2025-12-04T12:54:32.6561839Z distributed/test_cupy_as_tensor 1/1 2025-12-04T12:54:32.6562113Z distributed/fsdp/test_fsdp_fx 1/1 2025-12-04T12:54:32.6562375Z distributed/_tools/test_sac_ilp 1/1 2025-12-04T12:54:32.6562660Z distributed/checkpoint/test_hf_storage 1/1 2025-12-04T12:54:32.6562961Z distributed/pipelining/test_microbatch 1/1 2025-12-04T12:54:32.6563260Z distributed/tensor/test_placement_types 1/1 2025-12-04T12:54:32.6563595Z distributed/tensor/test_dtensor_dispatch_overhead 1/1 2025-12-04T12:54:32.6563980Z distributed/checkpoint/_experimental/test_checkpoint_reader 1/1 2025-12-04T12:54:32.6564395Z distributed/checkpoint/test_format_utils 1/1 2025-12-04T12:54:32.6564859Z distributed/test_aten_comm_compute_reordering 1/3 2025-12-04T12:54:32.6565176Z distributed/test_p2p_ipc 1/1 2025-12-04T12:54:32.6565447Z distributed/tensor/test_common_rules 1/1 2025-12-04T12:54:32.6565754Z distributed/checkpoint/test_hf_safetensor_e2e 1/1 2025-12-04T12:54:32.6566071Z distributed/_tools/test_sac_estimator 1/1 2025-12-04T12:54:32.6566363Z distributed/_tools/test_memory_tracker 1/1 2025-12-04T12:54:32.6566685Z distributed/checkpoint/_experimental/test_builder 1/1 2025-12-04T12:54:32.6567039Z distributed/_composable/test_replicate_with_fsdp 1/1 2025-12-04T12:54:32.6567359Z distributed/tensor/test_xla_integration 1/1 2025-12-04T12:54:32.6567679Z distributed/checkpoint/_experimental/test_types 1/1 2025-12-04T12:54:32.6568053Z distributed/tensor/experimental/test_register_sharding 1/1 2025-12-04T12:54:32.6568381Z distributed/test_backends 1/1 2025-12-04T12:54:32.6568657Z distributed/tensor/test_experimental_ops 1/1 2025-12-04T12:54:32.6568948Z distributed/checkpoint/test_quantized_hf_storage 1/1 2025-12-04T12:54:32.6569257Z distributed/_composable/test_composability/test_pp_composability 1/1 2025-12-04T12:54:32.6569558Z distributed/checkpoint/test_async_process_executor 1/1 2025-12-04T12:54:32.6569791Z distributed/tensor/test_tensor_ops 1/4 2025-12-04T12:54:32.6570001Z distributed/tensor/test_tensor_ops 4/4 2025-12-04T12:54:32.6570258Z distributed/checkpoint/fsdp/test_fsdp_dsd 1/1 2025-12-04T12:54:32.6570487Z distributed/checkpoint/test_save_load_api 1/1 2025-12-04T12:54:32.6570728Z distributed/tensor/debug/test_comm_mode_features 1/1 2025-12-04T12:54:32.6570956Z distributed/checkpoint/test_traverse 1/1 2025-12-04T12:54:32.6571163Z distributed/tensor/test_random_ops 1/1 2025-12-04T12:54:32.6571403Z 
distributed/_composable/test_replicate_mixed_precision 1/1 2025-12-04T12:54:32.6571681Z distributed/_composable/fsdp/test_fully_shard_logging 1/1 2025-12-04T12:54:32.6571973Z distributed/_composable/fsdp/test_fully_shard_ignore_params 1/1 2025-12-04T12:54:32.6572256Z distributed/checkpoint/_experimental/test_staging 1/1 2025-12-04T12:54:32.6572533Z distributed/checkpoint/test_fsdp_tp_checkpoint_conversion 1/1 2025-12-04T12:54:32.6572785Z distributed/launcher/test_api 1/1 2025-12-04T12:54:32.6573009Z distributed/elastic/multiprocessing/test_api 1/1 2025-12-04T12:54:32.6573234Z distributed/fsdp/test_shard_utils 1/1 2025-12-04T12:54:32.6573459Z distributed/tensor/experimental/test_local_map 1/1 2025-12-04T12:54:32.6573683Z distributed/test_local_tensor 1/1 2025-12-04T12:54:32.6573912Z distributed/_composable/fsdp/test_fully_shard_state 1/1 2025-12-04T12:54:32.6574158Z distributed/checkpoint/test_tp_checkpoint 1/1 2025-12-04T12:54:32.6574373Z distributed/tensor/test_optimizers 1/1 2025-12-04T12:54:32.6574574Z distributed/test_symmetric_memory 1/1 2025-12-04T12:54:32.6574787Z distributed/_tools/test_runtime_estimator 1/1 2025-12-04T12:54:32.6575004Z distributed/fsdp/test_fsdp_memory 1/1 2025-12-04T12:54:32.6575202Z distributed/test_fake_pg 1/1 2025-12-04T12:54:32.6575408Z distributed/checkpoint/test_fsdp_model_state 1/1 2025-12-04T12:54:32.6575628Z distributed/fsdp/test_utils 1/1 2025-12-04T12:54:32.6575844Z distributed/tensor/parallel/test_tp_examples 1/1 2025-12-04T12:54:32.6576177Z distributed/_composable/fsdp/test_fully_shard_clip_grad_norm_ 1/1 2025-12-04T12:54:32.6576439Z distributed/tensor/debug/test_comm_mode 1/1 2025-12-04T12:54:32.6576648Z distributed/test_dist2 1/1 2025-12-04T12:54:32.6576878Z distributed/_composable/fsdp/test_fully_shard_grad_scaler 1/1 2025-12-04T12:54:32.6577122Z distributed/launcher/test_run 1/1 2025-12-04T12:54:32.6577338Z distributed/fsdp/test_fsdp_backward_prefetch 1/1 2025-12-04T12:54:32.6577565Z distributed/fsdp/test_fsdp_pure_fp16 1/1 2025-12-04T12:54:32.6577780Z distributed/checkpoint/test_checkpoint 1/1 2025-12-04T12:54:32.6577994Z distributed/fsdp/test_fsdp_apply 1/1 2025-12-04T12:54:32.6578280Z distributed/_composable/fsdp/test_fully_shard_frozen 1/1 2025-12-04T12:54:32.6578537Z distributed/checkpoint/test_hsdp_checkpoint 1/1 2025-12-04T12:54:32.6578755Z distributed/tensor/parallel/test_parallelize_api 1/1 2025-12-04T12:54:32.6578939Z distributed/fsdp/test_fsdp_state_dict 1/2 2025-12-04T12:54:32.6579127Z distributed/_composable/fsdp/test_fully_shard_init 1/1 2025-12-04T12:54:32.6579317Z distributed/fsdp/test_fsdp_flatten_params 1/1 2025-12-04T12:54:32.6579488Z distributed/test_distributed_spawn 3/7 2025-12-04T12:54:32.6579650Z distributed/test_distributed_spawn 6/7 2025-12-04T12:54:32.6579811Z distributed/test_serialization 1/1 2025-12-04T12:54:32.6579983Z distributed/fsdp/test_fsdp_multiple_wrapping 1/1 2025-12-04T12:54:32.6580217Z distributed/_composable/fsdp/test_fully_shard_comm 1/1 2025-12-04T12:54:32.6580421Z distributed/checkpoint/test_file_system_checkpoint 1/1 2025-12-04T12:54:32.6580603Z distributed/test_composability 1/1 2025-12-04T12:54:32.6580792Z distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 2025-12-04T12:54:32.6580965Z distributed/fsdp/test_fsdp_comm_hooks 1/1 2025-12-04T12:54:32.6581123Z distributed/_shard/test_sharder 1/1 2025-12-04T12:54:32.6581306Z distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 2025-12-04T12:54:32.6581501Z distributed/fsdp/test_fsdp_tp_integration 1/1 2025-12-04T12:54:32.6581689Z 
distributed/_shard/sharded_optim/test_sharded_optim 1/1 2025-12-04T12:54:32.6581904Z distributed/_composable/fsdp/test_fully_shard_state_dict 1/1 2025-12-04T12:54:32.6582099Z distributed/test_c10d_pypg 1/1 2025-12-04T12:54:32.6582251Z distributed/test_pg_wrapper 1/1 2025-12-04T12:54:32.6582433Z distributed/_shard/sharded_tensor/ops/test_binary_cmp 1/1 2025-12-04T12:54:32.6582624Z distributed/nn/jit/test_instantiator 1/1 2025-12-04T12:54:32.6582810Z distributed/_shard/sharding_spec/test_sharding_spec 1/1 2025-12-04T12:54:32.6582992Z distributed/test_nccl 1/1 2025-12-04T12:54:32.6583137Z distributed/fsdp/test_fsdp_misc 1/1 2025-12-04T12:54:32.6583293Z distributed/fsdp/test_fsdp_meta 1/1 2025-12-04T12:54:32.6583460Z distributed/fsdp/test_fsdp_unshard_params 1/1 2025-12-04T12:54:32.6583643Z distributed/checkpoint/test_state_dict_utils 1/1 2025-12-04T12:54:32.6583836Z distributed/_shard/sharded_tensor/ops/test_init 1/1 2025-12-04T12:54:32.6584132Z distributed/_shard/sharded_tensor/ops/test_embedding 1/1 2025-12-04T12:54:32.6584696Z distributed/_shard/sharded_tensor/ops/test_embedding_bag 1/1 2025-12-04T12:54:32.6584968Z distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 2025-12-04T12:54:32.6585225Z distributed/fsdp/test_fsdp_core 2/2 2025-12-04T12:54:32.6585420Z distributed/test_c10d_ucc 1/1 2025-12-04T12:54:32.6585613Z distributed/test_c10d_common 1/1 2025-12-04T12:54:32.6585825Z distributed/fsdp/test_fsdp_mixed_precision 1/1 2025-12-04T12:54:32.6586024Z distributed/test_c10d_nccl 2/2 2025-12-04T12:54:32.6586205Z Parallel tests (0): 2025-12-04T12:54:32.6586395Z Name: excluded (est. time: 0.0min) 2025-12-04T12:54:32.6594339Z Serial tests (0): 2025-12-04T12:54:32.6594459Z Parallel tests (0): 2025-12-04T12:54:32.6594646Z Running distributed/test_dynamo_distributed 1/1 ... [2025-12-04 12:54:32.655879][2237983.145857035] 2025-12-04T12:54:32.6594921Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:54:32.6595334Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_dynamo_distributed.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:54:32.656142] 2025-12-04T13:01:58.9156632Z 2025-12-04T13:01:58.9157759Z distributed/test_dynamo_distributed 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_dynamo_distributed_1.1_b16db1b1e4a37d1c_.log 2025-12-04T13:01:58.9174079Z Running 62 items in this shard: test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_call_method_forward, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_ddp_optimizer_inductor_strides_dont_specialize, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_hf_bert_ddp_aot_eager, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_hf_bert_ddp_inductor, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_issue90375, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_symbol_splitting, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_unbacked_symbol_splitting_direct, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_unbacked_symbol_splitting_indirect, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_unbacked_symbol_splitting_no_binding, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_unbacked_symbol_splitting_torture_multi, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_asymmetric_compilation, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_asymmetric_compilation_with_fx_cache, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_automatic_dynamic_scalar, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_automatic_dynamic_speculation_divergence, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_automatic_dynamic_tensor, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_dim_mismatch, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_graph_break_empty_graph_still_collective, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_missing_source, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_scalar_missing_source, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_type_mismatch, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_ddp_activation_checkpointing, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_ddp_baseline_aot_eager_multiprocess, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_ddp_optimizer_cudagraph, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_activation_checkpointing, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_aot_eager, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_inductor, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_setattr, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_unspecialized_forced_getattr_inline, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_unspecialized_forced_getattr_no_inline, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_get_pg_attr, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_guard_collective, 
test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_ddp_aot_eager, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_ddp_aot_eager_static_graph, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_ddp_inductor, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_ddp_inductor_static_graph, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_fsdp, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_fsdp_activation_checkpointing, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_multiproc_autotune, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_multiproc_autotune_dynamic_shapes, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_aot_autograd, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_async_subclass_no_specialize, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_compiled_flex_attention_full_model_ddp, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_compiled_flex_attention_local_ddp, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_custom_layer, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_ddp_baseline_aot_eager, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_ddp_baseline_inductor, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_empty_graph_inductor, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_dup_tensors_diff_source, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_dup_tensors_same_source, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_orig_params_assert, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_skip_guards, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_skip_register_attr_or_module, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_staticmethod, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_ctx_manager, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_inductor, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_inductor_layout_optimizations_inference, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_inductor_layout_optimizations_training, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_inductor_transpose, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_higher_order_op, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_ignored_parameters, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_no_split 2025-12-04T13:01:58.9184662Z 2025-12-04T13:01:58.9184815Z Finished distributed/test_dynamo_distributed 1/1 ... 
[2025-12-04 13:01:58.915838][2238429.40581478], took 7.44min 2025-12-04T13:01:58.9185308Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:02:01.1325114Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:02:01.1325655Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T13:02:01.1326026Z Uploading artifacts took 0.00 seconds 2025-12-04T13:02:01.1326481Z Running distributed/tensor/test_op_schema 1/1 ... [2025-12-04 13:02:01.132327][2238431.622299434] 2025-12-04T13:02:01.1326902Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:02:01.1327789Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_op_schema.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:02:01.132642] 2025-12-04T13:02:03.3513191Z 2025-12-04T13:02:03.3514329Z distributed/tensor/test_op_schema 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_op_schema_1.1_7f47e51292880941_.log 2025-12-04T13:02:03.3516234Z Running 2 items in this shard: test/distributed/tensor/test_op_schema.py::TestOpSchema::test_equality_checks_lists_of_dtensor_spec, test/distributed/tensor/test_op_schema.py::TestOpSchema::test_equality_respects_static_attributes 2025-12-04T13:02:03.3517041Z 2025-12-04T13:02:03.3517334Z Finished distributed/tensor/test_op_schema 1/1 ... [2025-12-04 13:02:03.350969][2238433.840943573], took 0.04min 2025-12-04T13:02:03.3518382Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:02:03.3540702Z Failed to parse and upload json test reports: Unable to locate credentials
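Each test file above runs in its own subprocess, and the "Executing [...]" lines show the fixed flag set: python -bb (treat bytes/str mixing as an error), per-file --shard-id/--num-shards, pytest as the runner with -x (stop on first failure) and --reruns=0, plus imports of the slow-test and disabled-test lists. The recurring "Failed to parse and upload json test reports: Unable to locate credentials" lines appear benign here: the runner has no AWS credentials for the metrics upload, and the tests themselves still pass. A sketch mirroring the invocation shape (illustrative; run_test_file is a hypothetical helper, not a function in the repo):

    import subprocess
    import sys

    def run_test_file(path, shard_id=1, num_shards=1):
        # Hypothetical helper: mirrors the "Executing [...]" command lines in this log.
        cmd = [
            sys.executable, "-bb", path,
            f"--shard-id={shard_id}", f"--num-shards={num_shards}",
            "-v", "-vv", "-rfEX", "-p", "no:xdist",
            "--use-pytest", "-x", "--reruns=0",
            "--import-slow-tests", "--import-disabled-tests",
        ]
        return subprocess.run(cmd, cwd="test").returncode

2025-12-04T13:02:03.3543057Z Running distributed/checkpoint/test_nested_dict 1/1 ... [2025-12-04 13:02:03.354183][2238433.844163575] 2025-12-04T13:02:03.3543438Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:02:03.3545084Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_nested_dict.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:02:03.354373] 2025-12-04T13:02:05.8231268Z 2025-12-04T13:02:05.8232336Z distributed/checkpoint/test_nested_dict 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_nested_dict_1.1_058fe9047d549002_.log 2025-12-04T13:02:05.8233544Z Running 2 items in this shard: test/distributed/checkpoint/test_nested_dict.py::TestFlattening::test_flattening_round_trip, test/distributed/checkpoint/test_nested_dict.py::TestFlattening::test_mapping 2025-12-04T13:02:05.8234180Z 2025-12-04T13:02:05.8234681Z Finished distributed/checkpoint/test_nested_dict 1/1 ... 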
[2025-12-04 13:02:05.822733][2238436.312710282], took 0.04min 2025-12-04T13:02:05.8235556Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:02:05.8256450Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:02:05.8259004Z Running distributed/checkpoint/test_consolidate_hf_safetensors 1/1 ... [2025-12-04 13:02:05.825773][2238436.315754137] 2025-12-04T13:02:05.8259387Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:02:05.8260948Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_consolidate_hf_safetensors.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:02:05.825969] 2025-12-04T13:02:33.4354303Z 2025-12-04T13:02:33.4355276Z distributed/checkpoint/test_consolidate_hf_safetensors 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_consolidate_hf_safetensors_1.1_7be5a100ba98431f_.log 2025-12-04T13:02:33.4358367Z Running 7 items in this shard: test/distributed/checkpoint/test_consolidate_hf_safetensors.py::TestConsolidateHFSafeTensors::test_calculate_max_contiguous_elements_valid_cases, test/distributed/checkpoint/test_consolidate_hf_safetensors.py::TestConsolidateHFSafeTensors::test_calculate_max_contiguous_elements_validations, test/distributed/checkpoint/test_consolidate_hf_safetensors.py::TestConsolidateHFSafeTensors::test_consolidate_one_file_with_two_ranks, test/distributed/checkpoint/test_consolidate_hf_safetensors.py::TestConsolidateHFSafeTensors::test_consolidate_to_one_file, test/distributed/checkpoint/test_consolidate_hf_safetensors.py::TestConsolidateHFSafeTensors::test_consolidate_to_two_files, test/distributed/checkpoint/test_consolidate_hf_safetensors.py::TestConsolidateHFSafeTensors::test_consolidate_with_two_ranks, test/distributed/checkpoint/test_consolidate_hf_safetensors.py::TestConsolidateHFSafeTensors::test_write_sub_tensor_to_file_optimized 2025-12-04T13:02:33.4361374Z 2025-12-04T13:02:33.4361648Z Finished distributed/checkpoint/test_consolidate_hf_safetensors 1/1 ... [2025-12-04 13:02:33.435152][2238463.925128759], took 0.46min 2025-12-04T13:02:33.4362482Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:02:33.4380681Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:02:33.4382874Z Running distributed/tensor/test_dtensor_compile 3/4 ... [2025-12-04 13:02:33.438173][2238463.928153525] 2025-12-04T13:02:33.4383151Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:02:33.4385101Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_dtensor_compile.py', '--shard-id=3', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:02:33.438369] 2025-12-04T13:03:52.8306875Z 2025-12-04T13:03:52.8307522Z distributed/tensor/test_dtensor_compile 3/4 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_dtensor_compile_3.4_2492c8e2ca244aef_.log 2025-12-04T13:03:52.8310328Z Running 14 items in this shard: test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dtensor_basic_export, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dtensor_dynamic_loss_parallel_log_softmax, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dtensor_dynamic_recompiles, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dtensor_dynamic_slice, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dtensor_noncontiguous_output, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dtensor_partial_placement_redistribute_unbalanced_correct_strides, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dynamo_dtensor_from_local_dynamic_shapes, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dynamo_to_local_grad_placements_sequence, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_get_local_rank_compile, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_tp_compile_comm_reordering, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompileE2E::test_2d_fsdp_tp_ac_compile_use_ca_True, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompileE2E::test_compile_embedding_redistribute, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompileE2E::test_tp_compile_fullgraph_is_seq_parallel_False_use_ca_False, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompileE2E::test_tp_compile_fullgraph_is_seq_parallel_False_use_ca_True 2025-12-04T13:03:52.8312675Z 2025-12-04T13:03:52.8312824Z Finished distributed/tensor/test_dtensor_compile 3/4 ... [2025-12-04 13:03:52.830383][2238543.320358777], took 1.32min 2025-12-04T13:03:52.8313395Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:03:52.8335046Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:03:52.8338005Z Running distributed/checkpoint/_experimental/test_barriers 1/1 ... [2025-12-04 13:03:52.833638][2238543.32361838] 2025-12-04T13:03:52.8338487Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:03:52.8339654Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/_experimental/test_barriers.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:03:52.833835] 2025-12-04T13:03:55.1021981Z 2025-12-04T13:03:55.1023717Z distributed/checkpoint/_experimental/test_barriers 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint._experimental.test_barriers_1.1_16aea4b5c15674a4_.log 2025-12-04T13:03:55.1026437Z Running 2 items in this shard: test/distributed/checkpoint/_experimental/test_barriers.py::TestBarriers::test_execute_barrier, test/distributed/checkpoint/_experimental/test_barriers.py::TestBarriers::test_tcpstore_barrier_initialization 2025-12-04T13:03:55.1027480Z 2025-12-04T13:03:55.1027932Z Finished distributed/checkpoint/_experimental/test_barriers 1/1 ... [2025-12-04 13:03:55.101887][2238545.591862095], took 0.04min 2025-12-04T13:03:55.1029429Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:03:55.1045119Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:03:55.1047556Z Running distributed/pipelining/test_transformer 1/1 ... [2025-12-04 13:03:55.104670][2238545.594650404] 2025-12-04T13:03:55.1047890Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:03:55.1049955Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/pipelining/test_transformer.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:03:55.104868] 2025-12-04T13:03:59.7272402Z 2025-12-04T13:03:59.7273783Z distributed/pipelining/test_transformer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.pipelining.test_transformer_1.1_26882adc9454c705_.log 2025-12-04T13:03:59.7275151Z Running 1 items in this shard: test/distributed/pipelining/test_transformer.py::TransformerTestsCUDA::test_ir_cuda 2025-12-04T13:03:59.7275672Z 2025-12-04T13:03:59.7276068Z Finished distributed/pipelining/test_transformer 1/1 ... [2025-12-04 13:03:59.726867][2238550.216841416], took 0.08min 2025-12-04T13:03:59.7278954Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:03:59.7299451Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:03:59.7302101Z Running distributed/flight_recorder/test_fr_analysis 1/1 ... [2025-12-04 13:03:59.730079][2238550.220059519] 2025-12-04T13:03:59.7302571Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:03:59.7304207Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/flight_recorder/test_fr_analysis.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:03:59.730278] 2025-12-04T13:04:01.9985185Z 2025-12-04T13:04:01.9986034Z distributed/flight_recorder/test_fr_analysis 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.flight_recorder.test_fr_analysis_1.1_8152aacd3ccdc658_.log 2025-12-04T13:04:01.9988462Z Running 4 items in this shard: test/distributed/flight_recorder/test_fr_analysis.py::FlightRecorderEventTest::test_all_events, test/distributed/flight_recorder/test_fr_analysis.py::FlightRecorderEventTest::test_match_one_event, test/distributed/flight_recorder/test_fr_analysis.py::FlightMatchInfoTest::test_match_info, test/distributed/flight_recorder/test_fr_analysis.py::FlightRecorderE2ETest::testBuildDB 2025-12-04T13:04:01.9990077Z 2025-12-04T13:04:01.9990548Z Finished distributed/flight_recorder/test_fr_analysis 1/1 ... [2025-12-04 13:04:01.998214][2238552.488191669], took 0.04min 2025-12-04T13:04:01.9991011Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:04:02.0008640Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:04:02.0011471Z Running distributed/_composable/test_contract 1/1 ... [2025-12-04 13:04:02.000986][2238552.490966699] 2025-12-04T13:04:02.0012387Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:04:02.0013354Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/test_contract.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:04:02.001185] 2025-12-04T13:04:04.2193771Z 2025-12-04T13:04:04.2194775Z distributed/_composable/test_contract 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.test_contract_1.1_8b34c7c3d7281262_.log 2025-12-04T13:04:04.2197442Z Running 5 items in this shard: test/distributed/_composable/test_contract.py::TestContract::test_add_hooks, test/distributed/_composable/test_contract.py::TestContract::test_modify_fqn, test/distributed/_composable/test_contract.py::TestContract::test_multi_module_api, test/distributed/_composable/test_contract.py::TestContract::test_registry, test/distributed/_composable/test_contract.py::TestContract::test_state 2025-12-04T13:04:04.2198899Z 2025-12-04T13:04:04.2199242Z Finished distributed/_composable/test_contract 1/1 ... [2025-12-04 13:04:04.219080][2238554.709056454], took 0.04min 2025-12-04T13:04:04.2200440Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:04:04.2222202Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:04:04.2225007Z Running distributed/checkpoint/test_dedup_tensors 1/1 ... [2025-12-04 13:04:04.222351][2238554.712331126] 2025-12-04T13:04:04.2225397Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:04:04.2226704Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_dedup_tensors.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:04:04.222545] 2025-12-04T13:04:06.4407691Z 2025-12-04T13:04:06.4408887Z distributed/checkpoint/test_dedup_tensors 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_dedup_tensors_1.1_d2f3fe9426031561_.log 2025-12-04T13:04:06.4410467Z Running 1 items in this shard: test/distributed/checkpoint/test_dedup_tensors.py::TestDedupTensor::test_dedup_shards 2025-12-04T13:04:06.4410948Z 2025-12-04T13:04:06.4411318Z Finished distributed/checkpoint/test_dedup_tensors 1/1 ... [2025-12-04 13:04:06.440480][2238556.930454855], took 0.04min 2025-12-04T13:04:06.4415141Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:04:06.4436262Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:04:06.4439023Z Running distributed/test_c10d_functional_native 1/1 ... [2025-12-04 13:04:06.443762][2238556.933742447] 2025-12-04T13:04:06.4439419Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:04:06.4441096Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_functional_native.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:04:06.443952] 2025-12-04T13:07:21.1094818Z 2025-12-04T13:07:21.1095925Z distributed/test_c10d_functional_native 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_functional_native_1.1_cad39553bc20816e_.log 2025-12-04T13:07:21.1103889Z Running 33 items in this shard: test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_all_gather_into_tensor_coalesced, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_all_gather_into_tensor_single, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_all_reduce_coalesced, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_all_reduce_coalesced_, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_all_reduce_single, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_all_reduce_single_, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_all_to_all_single, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_broadcast, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_fixed_striding, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_functional_collectives_inference_mode, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_inductor_dtypeview_memory_leak, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_reduce_scatter_tensor_coalesced, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_reduce_scatter_tensor_out, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_reduce_scatter_tensor_single, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_threading, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_unwaited, test/distributed/test_c10d_functional_native.py::TestWithNCCL::test_wait_tensor, test/distributed/test_c10d_functional_native.py::PyWorkTest::test_collectives, test/distributed/test_c10d_functional_native.py::PyWorkTest::test_wait_tensor, 
test/distributed/test_c10d_functional_native.py::CompileTestCPU::test_inductor_all_reduce_cpu, test/distributed/test_c10d_functional_native.py::CompileTest::test_inductor_all_gather_into_tensor_coalesced, test/distributed/test_c10d_functional_native.py::CompileTest::test_inductor_all_gather_into_tensor_single, test/distributed/test_c10d_functional_native.py::CompileTest::test_inductor_all_reduce_coalesced, test/distributed/test_c10d_functional_native.py::CompileTest::test_inductor_all_reduce_non_contig_input, test/distributed/test_c10d_functional_native.py::CompileTest::test_inductor_all_reduce_single, test/distributed/test_c10d_functional_native.py::CompileTest::test_inductor_all_to_all_single, test/distributed/test_c10d_functional_native.py::CompileTest::test_inductor_broadcast, test/distributed/test_c10d_functional_native.py::CompileTest::test_inductor_inplace_op_on_view, test/distributed/test_c10d_functional_native.py::CompileTest::test_inductor_reduce_scatter_tensor_coalesced, test/distributed/test_c10d_functional_native.py::CompileTest::test_inductor_reduce_scatter_tensor_single, test/distributed/test_c10d_functional_native.py::CompileTest::test_inductor_reuse_buffer_after_inplace_collective, test/distributed/test_c10d_functional_native.py::CompileTest::test_ranks_and_tag, test/distributed/test_c10d_functional_native.py::CompileTest::test_wait_tensor 2025-12-04T13:07:21.1110003Z 2025-12-04T13:07:21.1110257Z Finished distributed/test_c10d_functional_native 1/1 ... [2025-12-04 13:07:21.109282][2238751.599255899], took 3.24min 2025-12-04T13:07:21.1110882Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:07:21.1116974Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:07:21.1119464Z Running distributed/pipelining/test_backward 1/1 ... [2025-12-04 13:07:21.111867][2238751.601847662] 2025-12-04T13:07:21.1119705Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:07:21.1121935Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/pipelining/test_backward.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:07:21.112065] 2025-12-04T13:07:27.7361267Z 2025-12-04T13:07:27.7362690Z distributed/pipelining/test_backward 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.pipelining.test_backward_1.1_93d27088815687a1_.log 2025-12-04T13:07:27.7365351Z Running 5 items in this shard: test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_cuda, test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_input_cuda, test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_weight_cuda, test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_weight_grad_validation_cuda, test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_weight_multiple_iters_cuda 2025-12-04T13:07:27.7366900Z 2025-12-04T13:07:27.7367353Z Finished distributed/pipelining/test_backward 1/1 ... 
[2025-12-04 13:07:27.735730][2238758.225704685], took 0.11min 2025-12-04T13:07:27.7368201Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:07:27.7384133Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:07:27.7386761Z Running distributed/test_nvshmem_triton 1/1 ... [2025-12-04 13:07:27.738601][2238758.228581694] 2025-12-04T13:07:27.7387057Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:07:27.7389071Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_nvshmem_triton.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:07:27.738800] 2025-12-04T13:07:32.9106649Z 2025-12-04T13:07:32.9107933Z distributed/test_nvshmem_triton 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_nvshmem_triton_1.1_569ca2f8653a3ea7_.log 2025-12-04T13:07:32.9108707Z 2025-12-04T13:07:32.9109071Z Finished distributed/test_nvshmem_triton 1/1 ... [2025-12-04 13:07:32.910328][2238763.400304159], took 0.09min 2025-12-04T13:07:32.9112993Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:07:32.9132657Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:07:32.9135488Z Running distributed/tensor/test_dtensor 1/3 ... [2025-12-04 13:07:32.913438][2238763.403418474] 2025-12-04T13:07:32.9135795Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:07:32.9137330Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_dtensor.py', '--shard-id=1', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:07:32.913626] 2025-12-04T13:08:26.2626502Z 2025-12-04T13:08:26.2629626Z distributed/tensor/test_dtensor 1/3 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_dtensor_1.3_0a15940e5c0bd0c3_.log 2025-12-04T13:08:26.2639574Z Running 28 items in this shard: test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_new_empty_strided, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_save_load, test/distributed/tensor/test_dtensor.py::DTensorTest::test_from_local_uneven_sharding, test/distributed/tensor/test_dtensor.py::DTensorTest::test_full_tensor_grad_hint, test/distributed/tensor/test_dtensor.py::DTensorTest::test_shard_tensor_2d, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_properties, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_stride, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local_uneven_sharding, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local_uneven_sharding_raise_error, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_meta_dtensor, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_modules_w_meta_dtensor, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_device_mesh_nd, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_api_device_mesh_context_manager, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_cond, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_from_local_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_implicit_replication, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_inplace_on_local_tensor_view, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_redistribute_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_default_value_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_2d_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_api_device_mesh_context_manager, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_cond, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_device_mesh_device_conversion, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_from_local_sub_mesh, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_dtensor_spec_default_shard_order_generation, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_dtensor_spec_update, test/distributed/tensor/test_dtensor.py::TestDTensorSpecWithLocalTensor::test_dtensor_spec_print, test/distributed/tensor/test_dtensor.py::TestDTensorSpecWithLocalTensor::test_dtensor_spec_with_invalid_shard_order 2025-12-04T13:08:26.2643742Z 2025-12-04T13:08:26.2643880Z Finished distributed/tensor/test_dtensor 1/3 ... [2025-12-04 13:08:26.262708][2238816.752685241], took 0.89min 2025-12-04T13:08:26.2644446Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:08:26.2671978Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:08:26.2672335Z Running distributed/test_cupy_as_tensor 1/1 ... 
[2025-12-04 13:08:26.265273][2238816.755253454] 2025-12-04T13:08:26.2673729Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:08:26.2674280Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_cupy_as_tensor.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:08:26.265466] 2025-12-04T13:08:31.5881625Z 2025-12-04T13:08:31.5883523Z distributed/test_cupy_as_tensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_cupy_as_tensor_1.1_b8e7a25d5ab2fea7_.log 2025-12-04T13:08:31.5884675Z Running 1 items in this shard: test/distributed/test_cupy_as_tensor.py::CupyAsTensorTest::test_cupy_as_tensor 2025-12-04T13:08:31.5885125Z 2025-12-04T13:08:31.5885440Z Finished distributed/test_cupy_as_tensor 1/1 ... [2025-12-04 13:08:31.587849][2238822.077823498], took 0.09min 2025-12-04T13:08:31.5890155Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:08:31.5910237Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:08:31.5912497Z Running distributed/fsdp/test_fsdp_fx 1/1 ... [2025-12-04 13:08:31.591124][2238822.08110478] 2025-12-04T13:08:31.5912872Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:08:31.5914872Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_fx.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:08:31.591341] 2025-12-04T13:08:34.5610376Z 2025-12-04T13:08:34.5611612Z distributed/fsdp/test_fsdp_fx 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_fx_1.1_5e0a1afcfc52497b_.log 2025-12-04T13:08:34.5615899Z Running 1 items in this shard: test/distributed/fsdp/test_fsdp_fx.py::TestSymbolicTracingCUDA::test_symbolic_tracing_outputs_cuda 2025-12-04T13:08:34.5616393Z 2025-12-04T13:08:34.5616697Z Finished distributed/fsdp/test_fsdp_fx 1/1 ... [2025-12-04 13:08:34.560694][2238825.050668447], took 0.05min 2025-12-04T13:08:34.5618392Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:08:34.5639801Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:08:34.5640277Z Running distributed/_tools/test_sac_ilp 1/1 ... [2025-12-04 13:08:34.563824][2238825.053805032] 2025-12-04T13:08:34.5640591Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:08:34.5642705Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tools/test_sac_ilp.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:08:34.564035] 2025-12-04T13:08:38.8860490Z 2025-12-04T13:08:38.8861614Z distributed/_tools/test_sac_ilp 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tools.test_sac_ilp_1.1_0e3d0b8bcf5ec37b_.log 2025-12-04T13:08:38.8863251Z Running 4 items in this shard: test/distributed/_tools/test_sac_ilp.py::TestSACILP::test_sac_ilp_case1, test/distributed/_tools/test_sac_ilp.py::TestSACILP::test_sac_ilp_case2, test/distributed/_tools/test_sac_ilp.py::TestSACILP::test_sac_ilp_case3, test/distributed/_tools/test_sac_ilp.py::TestOptimalCheckpointingPolicy::test_get_optimial_checkpointing_policy_per_module 2025-12-04T13:08:38.8864313Z 2025-12-04T13:08:38.8864582Z Finished distributed/_tools/test_sac_ilp 1/1 ... [2025-12-04 13:08:38.885678][2238829.375652896], took 0.07min 2025-12-04T13:08:38.8867212Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:08:38.8883907Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:08:38.8887529Z Running distributed/checkpoint/test_hf_storage 1/1 ... [2025-12-04 13:08:38.888460][2238829.378441465] 2025-12-04T13:08:38.8887858Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:08:38.8890644Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_hf_storage.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:08:38.888662] 2025-12-04T13:08:41.1066291Z 2025-12-04T13:08:41.1067467Z distributed/checkpoint/test_hf_storage 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_hf_storage_1.1_e67fd8423bbd0d75_.log 2025-12-04T13:08:41.1069476Z Running 5 items in this shard: test/distributed/checkpoint/test_hf_storage.py::TestHfStorage::test_read_data_hf, test/distributed/checkpoint/test_hf_storage.py::TestHfStorage::test_read_metadata_hf, test/distributed/checkpoint/test_hf_storage.py::TestHfStorage::test_write_data_hf, test/distributed/checkpoint/test_hf_storage.py::TestHfStorage::test_write_data_with_sharding, test/distributed/checkpoint/test_hf_storage.py::TestHfStorage::test_write_metadata_hf 2025-12-04T13:08:41.1071068Z 2025-12-04T13:08:41.1071366Z Finished distributed/checkpoint/test_hf_storage 1/1 ... [2025-12-04 13:08:41.106325][2238831.59630099], took 0.04min 2025-12-04T13:08:41.1073850Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:08:41.1094036Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:08:41.1094437Z Running distributed/pipelining/test_microbatch 1/1 ... [2025-12-04 13:08:41.109245][2238831.599226558] 2025-12-04T13:08:41.1094767Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:08:41.1096299Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/pipelining/test_microbatch.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:08:41.109447] 2025-12-04T13:08:54.5977187Z 2025-12-04T13:08:54.5979180Z distributed/pipelining/test_microbatch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.pipelining.test_microbatch_1.1_affd8957f5b75db3_.log 2025-12-04T13:08:54.5982305Z Running 5 items in this shard: test/distributed/pipelining/test_microbatch.py::MicrobatchTestsCUDA::test_chunk_spec_cuda, test/distributed/pipelining/test_microbatch.py::MicrobatchTestsCUDA::test_split_and_merge_cuda, test/distributed/pipelining/test_microbatch.py::MicrobatchTestsCUDA::test_split_block_mask_batch_size_one_cuda, test/distributed/pipelining/test_microbatch.py::MicrobatchTestsCUDA::test_split_block_mask_cuda, test/distributed/pipelining/test_microbatch.py::MicrobatchTestsCUDA::test_split_block_mask_none_cuda 2025-12-04T13:08:54.5984473Z 2025-12-04T13:08:54.5984874Z Finished distributed/pipelining/test_microbatch 1/1 ... [2025-12-04 13:08:54.597340][2238845.087314539], took 0.22min 2025-12-04T13:08:54.5986157Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:08:54.6005018Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:08:54.6005402Z Running distributed/tensor/test_placement_types 1/1 ... [2025-12-04 13:08:54.600366][2238845.090347466] 2025-12-04T13:08:54.6005720Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:08:54.6008239Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_placement_types.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:08:54.600572] 2025-12-04T13:08:56.7184171Z 2025-12-04T13:08:56.7185611Z distributed/tensor/test_placement_types 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_placement_types_1.1_15a6ab6c872e57e7_.log 2025-12-04T13:08:56.7188730Z Running 5 items in this shard: test/distributed/tensor/test_placement_types.py::PlacementTypesTestCase::test_dynamo_can_identify_placement_classes, test/distributed/tensor/test_placement_types.py::PlacementTypesTestCase::test_equality, test/distributed/tensor/test_placement_types.py::PlacementTypesTestCase::test_strided_shard_isinstance_shard, test/distributed/tensor/test_placement_types.py::PlacementTypesTestCase::test_strided_shard_kwonly_argument, test/distributed/tensor/test_placement_types.py::PlacementTypesTestCase::test_type_identification 2025-12-04T13:08:56.7190894Z 2025-12-04T13:08:56.7191236Z Finished distributed/tensor/test_placement_types 1/1 ... [2025-12-04 13:08:56.718084][2238847.20805936], took 0.04min 2025-12-04T13:08:56.7192262Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:08:56.7212958Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:08:56.7213458Z Running distributed/tensor/test_dtensor_dispatch_overhead 1/1 ... 
[2025-12-04 13:08:56.721160][2238847.211141006] 2025-12-04T13:08:56.7213849Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:08:56.7215775Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_dtensor_dispatch_overhead.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:08:56.721364] 2025-12-04T13:09:03.5480005Z 2025-12-04T13:09:03.5481264Z distributed/tensor/test_dtensor_dispatch_overhead 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_dtensor_dispatch_overhead_1.1_2ba90c7656932c9a_.log 2025-12-04T13:09:03.5482244Z Running 1 items in this shard: test/distributed/tensor/test_dtensor_dispatch_overhead.py::DistOpDispatchOverHead::test_dtensor_add_op_dispatch_overhead 2025-12-04T13:09:03.5483064Z 2025-12-04T13:09:03.5483332Z Finished distributed/tensor/test_dtensor_dispatch_overhead 1/1 ... [2025-12-04 13:09:03.547697][2238854.037674363], took 0.11min 2025-12-04T13:09:03.5488676Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:09:03.5507069Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:09:03.5509370Z Running distributed/checkpoint/_experimental/test_checkpoint_reader 1/1 ... [2025-12-04 13:09:03.550799][2238854.040779709] 2025-12-04T13:09:03.5509724Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:09:03.5511422Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/_experimental/test_checkpoint_reader.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:09:03.551004] 2025-12-04T13:09:05.9193963Z 2025-12-04T13:09:05.9195550Z distributed/checkpoint/_experimental/test_checkpoint_reader 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint._experimental.test_checkpoint_reader_1.1_c639a6bd2388b4d0_.log 2025-12-04T13:09:05.9200058Z Running 7 items in this shard: test/distributed/checkpoint/_experimental/test_checkpoint_reader.py::TestCheckpointReader::test_partial_read, test/distributed/checkpoint/_experimental/test_checkpoint_reader.py::TestCheckpointReader::test_partial_read_different_dtypes, test/distributed/checkpoint/_experimental/test_checkpoint_reader.py::TestCheckpointReader::test_partial_read_missing_keys, test/distributed/checkpoint/_experimental/test_checkpoint_reader.py::TestCheckpointReader::test_read_checkpoint, test/distributed/checkpoint/_experimental/test_checkpoint_reader.py::TestCheckpointReader::test_read_nonexistent_checkpoint, test/distributed/checkpoint/_experimental/test_checkpoint_reader.py::TestCheckpointReader::test_read_with_kwargs, test/distributed/checkpoint/_experimental/test_checkpoint_reader.py::TestCheckpointReader::test_read_with_map_location 2025-12-04T13:09:05.9202917Z 2025-12-04T13:09:05.9203282Z Finished distributed/checkpoint/_experimental/test_checkpoint_reader 1/1 ... 
[2025-12-04 13:09:05.919046][2238856.409021594], took 0.04min 2025-12-04T13:09:05.9204343Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:09:05.9220954Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:09:05.9225308Z Running distributed/checkpoint/test_format_utils 1/1 ... [2025-12-04 13:09:05.922154][2238856.412134759] 2025-12-04T13:09:05.9225655Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:09:05.9226306Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_format_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:09:05.922353] 2025-12-04T13:09:17.8563730Z 2025-12-04T13:09:17.8564853Z distributed/checkpoint/test_format_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_format_utils_1.1_a0dfb3e96b94611b_.log 2025-12-04T13:09:17.8567068Z Running 3 items in this shard: test/distributed/checkpoint/test_format_utils.py::TestFormatUtils::test_dcp_to_torch_save, test/distributed/checkpoint/test_format_utils.py::TestFormatUtils::test_online_torch_save_to_dcp, test/distributed/checkpoint/test_format_utils.py::TestFormatUtils::test_torch_save_to_dcp 2025-12-04T13:09:17.8567981Z 2025-12-04T13:09:17.8568272Z Finished distributed/checkpoint/test_format_utils 1/1 ... [2025-12-04 13:09:17.856064][2238868.346039959], took 0.20min 2025-12-04T13:09:17.8573014Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:09:17.8591282Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:09:17.8593077Z Running distributed/test_aten_comm_compute_reordering 1/3 ... [2025-12-04 13:09:17.859173][2238868.349154144] 2025-12-04T13:09:17.8593433Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:09:17.8594989Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_aten_comm_compute_reordering.py', '--shard-id=1', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:09:17.859372] 2025-12-04T13:12:36.4420369Z 2025-12-04T13:12:36.4421582Z distributed/test_aten_comm_compute_reordering 1/3 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_aten_comm_compute_reordering_1.3_b2b98bea2abd25a3_.log 2025-12-04T13:12:36.4429166Z Running 19 items in this shard: test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingMultiProc::test_custom_estimator_for_non_compute_nodes, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingMultiProc::test_inductor_default_comms_ordering, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingMultiProc::test_sink_waits, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingMultiProc::test_sink_waits_raise_comms, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_bucket_exposed_with_hidden_single_overlap, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_bucketing_split_for_overlap, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_bucketing_split_for_overlap_blocking_deps_inductor, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_bucketing_split_for_overlap_blocking_no_deps, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_collective_benchmarking_with_real_pg, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_grouped_scheduler_node, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_inductor_default_comms_ordering, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_multiple_hiding_nodes_bucketing, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_overlap_scheduling_via_config, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_sink_waits, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_sink_waits_raise_comms, test/distributed/test_aten_comm_compute_reordering.py::TestManualOverlapBucketing::test_inductor_default_comms_ordering, test/distributed/test_aten_comm_compute_reordering.py::TestManualOverlapBucketing::test_make_graph_view_and_get_subgraph_by_path, test/distributed/test_aten_comm_compute_reordering.py::TestManualOverlapBucketing::test_reorder_compute_for_overlap_mul, test/distributed/test_aten_comm_compute_reordering.py::TestManualOverlapBucketing::test_sink_waits 2025-12-04T13:12:36.4435312Z 2025-12-04T13:12:36.4435560Z Finished distributed/test_aten_comm_compute_reordering 1/3 ... [2025-12-04 13:12:36.441873][2239066.931849012], took 3.31min 2025-12-04T13:12:36.4436278Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:12:36.4448567Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:12:36.4450580Z Running distributed/test_p2p_ipc 1/1 ... 
[2025-12-04 13:12:36.444933][2239066.934914448] 2025-12-04T13:12:36.4450815Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:12:36.4452865Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_p2p_ipc.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:12:36.445160] 2025-12-04T13:12:41.0678913Z 2025-12-04T13:12:41.0679811Z distributed/test_p2p_ipc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_p2p_ipc_1.1_e11b956276118cc1_.log 2025-12-04T13:12:41.0680420Z Running 1 items in this shard: test/distributed/test_p2p_ipc.py::P2PIpcTest::test_p2p_ipc 2025-12-04T13:12:41.0680585Z 2025-12-04T13:12:41.0680711Z Finished distributed/test_p2p_ipc 1/1 ... [2025-12-04 13:12:41.067569][2239071.557544182], took 0.08min 2025-12-04T13:12:41.0688379Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:12:41.0709583Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:12:41.0710648Z Running distributed/tensor/test_common_rules 1/1 ... [2025-12-04 13:12:41.070858][2239071.560839395] 2025-12-04T13:12:41.0710884Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:12:41.0712629Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_common_rules.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:12:41.071071] 2025-12-04T13:12:45.3431692Z 2025-12-04T13:12:45.3432866Z distributed/tensor/test_common_rules 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_common_rules_1.1_0cd253581f6c1c51_.log 2025-12-04T13:12:45.3436299Z Running 10 items in this shard: test/distributed/tensor/test_common_rules.py::CommonRulesTest::test_einop_basic_propagation, test/distributed/tensor/test_common_rules.py::CommonRulesTest::test_einop_errors, test/distributed/tensor/test_common_rules.py::CommonRulesTest::test_einop_linearity, test/distributed/tensor/test_common_rules.py::CommonRulesTest::test_einop_merge_sharding, test/distributed/tensor/test_common_rules.py::CommonRulesTest::test_einop_multi_sharding_on_mesh_dim, test/distributed/tensor/test_common_rules.py::CommonRulesTest::test_einop_pointwise_propagation, test/distributed/tensor/test_common_rules.py::CommonRulesTest::test_pointwise_enforce_sharding_multi_sharding_on_mesh_dim, test/distributed/tensor/test_common_rules.py::CommonRulesTest::test_pointwise_multi_sharding_on_mesh_dim, test/distributed/tensor/test_common_rules.py::CommonRulesTest::test_pointwise_rules_broadcasting, test/distributed/tensor/test_common_rules.py::CommonRulesTest::test_pointwise_rules_suggestion 2025-12-04T13:12:45.3439104Z 2025-12-04T13:12:45.3439388Z Finished distributed/tensor/test_common_rules 1/1 ... 
[2025-12-04 13:12:45.342849][2239075.832825307], took 0.07min 2025-12-04T13:12:45.3441726Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:12:45.3462320Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:12:45.3462964Z Running distributed/checkpoint/test_hf_safetensor_e2e 1/1 ... [2025-12-04 13:12:45.346144][2239075.83612497] 2025-12-04T13:12:45.3463300Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:12:45.3466740Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_hf_safetensor_e2e.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:12:45.346356] 2025-12-04T13:13:15.3137357Z 2025-12-04T13:13:15.3142665Z distributed/checkpoint/test_hf_safetensor_e2e 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_hf_safetensor_e2e_1.1_99bd4ae9a4382135_.log 2025-12-04T13:13:15.3148460Z Running 11 items in this shard: test/distributed/checkpoint/test_hf_safetensor_e2e.py::TestSingleRankSaveLoad::test_load, test/distributed/checkpoint/test_hf_safetensor_e2e.py::TestSingleRankSaveLoad::test_load_into_empty_dict, test/distributed/checkpoint/test_hf_safetensor_e2e.py::TestSingleRankSaveLoad::test_load_with_multiple_threads, test/distributed/checkpoint/test_hf_safetensor_e2e.py::TestSingleRankSaveLoad::test_quantized_checkpoint_loading, test/distributed/checkpoint/test_hf_safetensor_e2e.py::TestSingleRankSaveLoad::test_save, test/distributed/checkpoint/test_hf_safetensor_e2e.py::TestDistributedHFSafetensorsConsolidation::test_consolidate_to_one_file, test/distributed/checkpoint/test_hf_safetensor_e2e.py::TestDTensorReshardPlacementChange::test_1d_to_1d_reshard_placement_change, test/distributed/checkpoint/test_hf_safetensor_e2e.py::TestDTensorReshardPlacementChange::test_2d_to_2d_reshard_placement_change, test/distributed/checkpoint/test_hf_safetensor_e2e.py::TestDTensorReshardMeshChange::test_1d_to_2d_reshard_mesh_change, test/distributed/checkpoint/test_hf_safetensor_e2e.py::TestDTensorReshardMeshChange::test_2d_to_1d_reshard_mesh_change, test/distributed/checkpoint/test_hf_safetensor_e2e.py::TestDTensorReshardMeshChange::test_dtensor_checkpoint_resharding_with_empty_shard 2025-12-04T13:13:15.3151738Z 2025-12-04T13:13:15.3151989Z Finished distributed/checkpoint/test_hf_safetensor_e2e 1/1 ... [2025-12-04 13:13:15.313385][2239105.803361761], took 0.50min 2025-12-04T13:13:15.3152738Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:13:15.3158954Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:13:15.3162306Z Running distributed/_tools/test_sac_estimator 1/1 ... [2025-12-04 13:13:15.315969][2239105.805950084] 2025-12-04T13:13:15.3162861Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:13:15.3163777Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tools/test_sac_estimator.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:13:15.316188] 2025-12-04T13:13:19.4375583Z 2025-12-04T13:13:19.4376532Z distributed/_tools/test_sac_estimator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tools.test_sac_estimator_1.1_6bf32c20d943f613_.log 2025-12-04T13:13:19.4377806Z Running 2 items in this shard: test/distributed/_tools/test_sac_estimator.py::TestSACEstimator::test_simple_model_sac_estimation, test/distributed/_tools/test_sac_estimator.py::TestSACEstimator::test_transformer_sac_estimation 2025-12-04T13:13:19.4378502Z 2025-12-04T13:13:19.4378770Z Finished distributed/_tools/test_sac_estimator 1/1 ... [2025-12-04 13:13:19.437189][2239109.927165877], took 0.07min 2025-12-04T13:13:19.4382697Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:13:19.4399523Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:13:19.4401182Z Running distributed/_tools/test_memory_tracker 1/1 ... [2025-12-04 13:13:19.439997][2239109.929978557] 2025-12-04T13:13:19.4401518Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:13:19.4403663Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tools/test_memory_tracker.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:13:19.440212] 2025-12-04T13:13:25.5159971Z 2025-12-04T13:13:25.5161679Z distributed/_tools/test_memory_tracker 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tools.test_memory_tracker_1.1_038cb399725d7f3a_.log 2025-12-04T13:13:25.5163011Z Running 1 items in this shard: test/distributed/_tools/test_memory_tracker.py::TestMemoryTracker::test_local_model 2025-12-04T13:13:25.5163503Z 2025-12-04T13:13:25.5163891Z Finished distributed/_tools/test_memory_tracker 1/1 ... [2025-12-04 13:13:25.515583][2239116.005558111], took 0.10min 2025-12-04T13:13:25.5170893Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:13:25.5189945Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:13:25.5192254Z Running distributed/checkpoint/_experimental/test_builder 1/1 ... [2025-12-04 13:13:25.519058][2239116.009039571] 2025-12-04T13:13:25.5192685Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:13:25.5193828Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/_experimental/test_builder.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:13:25.519254] 2025-12-04T13:13:29.6406241Z 2025-12-04T13:13:29.6407391Z distributed/checkpoint/_experimental/test_builder 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint._experimental.test_builder_1.1_a4cc5fb57a47b619_.log 2025-12-04T13:13:29.6409959Z Running 4 items in this shard: test/distributed/checkpoint/_experimental/test_builder.py::TestMakeCheckpointer::test_make_async_checkpointer, test/distributed/checkpoint/_experimental/test_builder.py::TestMakeCheckpointer::test_make_sync_checkpointer, test/distributed/checkpoint/_experimental/test_builder.py::TestMakeCheckpointer::test_make_sync_checkpointer_with_config_first, test/distributed/checkpoint/_experimental/test_builder.py::TestMakeCheckpointer::test_make_sync_checkpointer_with_custom_config 2025-12-04T13:13:29.6411896Z 2025-12-04T13:13:29.6412271Z Finished distributed/checkpoint/_experimental/test_builder 1/1 ... [2025-12-04 13:13:29.640311][2239120.130286406], took 0.07min 2025-12-04T13:13:29.6416049Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:13:29.6434806Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:13:29.6437254Z Running distributed/_composable/test_replicate_with_fsdp 1/1 ... [2025-12-04 13:13:29.643610][2239120.133590979] 2025-12-04T13:13:29.6437625Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:13:29.6439442Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/test_replicate_with_fsdp.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:13:29.643815] 2025-12-04T13:14:00.0633932Z 2025-12-04T13:14:00.0635656Z distributed/_composable/test_replicate_with_fsdp 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.test_replicate_with_fsdp_1.1_f4eca2853ca317d7_.log 2025-12-04T13:14:00.0638193Z Running 5 items in this shard: test/distributed/_composable/test_replicate_with_fsdp.py::ReplicateTest::test_replicate_tp_device_mesh, test/distributed/_composable/test_replicate_with_fsdp.py::ReplicateTest::test_replicate_transformer, test/distributed/_composable/test_replicate_with_fsdp.py::ReplicateTest::test_replicate_transformer_managed_modules, test/distributed/_composable/test_replicate_with_fsdp.py::ReplicateTest::test_train_parity_2d_mlp, test/distributed/_composable/test_replicate_with_fsdp.py::ReplicateTest::test_train_replicate_fsdp 2025-12-04T13:14:00.0639884Z 2025-12-04T13:14:00.0640387Z Finished distributed/_composable/test_replicate_with_fsdp 1/1 ... [2025-12-04 13:14:00.063161][2239150.553135604], took 0.51min 2025-12-04T13:14:00.0641896Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:14:00.0657762Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:14:00.0659895Z Running distributed/tensor/test_xla_integration 1/1 ... 
2025-12-04T13:14:00.0660287Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:14:00.0662466Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_xla_integration.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:14:00.066107]
2025-12-04T13:14:02.1341451Z
2025-12-04T13:14:02.1342407Z distributed/tensor/test_xla_integration 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_xla_integration_1.1_90f298e4a03bcef5_.log
2025-12-04T13:14:02.1344431Z Running 3 items in this shard: test/distributed/tensor/test_xla_integration.py::DTensorXLAIntegrationTest::test_xla_distribute_tensor_1d_replicate, test/distributed/tensor/test_xla_integration.py::DTensorXLAIntegrationTest::test_xla_distribute_tensor_1d_shard, test/distributed/tensor/test_xla_integration.py::DTensorXLAIntegrationTest::test_xla_distribute_tensor_2d
2025-12-04T13:14:02.1345759Z
2025-12-04T13:14:02.1346101Z Finished distributed/tensor/test_xla_integration 1/1 ... [2025-12-04 13:14:02.133874][2239152.623849396], took 0.03min
2025-12-04T13:14:02.1353696Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:14:02.1375188Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:14:02.1377670Z Running distributed/checkpoint/_experimental/test_types 1/1 ... [2025-12-04 13:14:02.137636][2239152.627617772]
2025-12-04T13:14:02.1378086Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:14:02.1379713Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/_experimental/test_types.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:14:02.137846]
2025-12-04T13:14:04.3061973Z
2025-12-04T13:14:04.3062790Z distributed/checkpoint/_experimental/test_types 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint._experimental.test_types_1.1_5d7f08754faaed88_.log
2025-12-04T13:14:04.3064028Z Running 3 items in this shard: test/distributed/checkpoint/_experimental/test_types.py::TestRankInfo::test_rank_info_default_initialization, test/distributed/checkpoint/_experimental/test_types.py::TestRankInfo::test_rank_info_initialization, test/distributed/checkpoint/_experimental/test_types.py::TestRankInfo::test_state_dict_type_alias
2025-12-04T13:14:04.3065199Z
2025-12-04T13:14:04.3065428Z Finished distributed/checkpoint/_experimental/test_types 1/1 ... [2025-12-04 13:14:04.305871][2239154.795845631], took 0.04min
2025-12-04T13:14:04.3071406Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:14:04.3087333Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:14:04.3089733Z Running distributed/tensor/experimental/test_register_sharding 1/1 ... [2025-12-04 13:14:04.308855][2239154.798836428]
2025-12-04T13:14:04.3090157Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:14:04.3091939Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/experimental/test_register_sharding.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:14:04.309063]
2025-12-04T13:14:20.5023210Z
2025-12-04T13:14:20.5024838Z distributed/tensor/experimental/test_register_sharding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.experimental.test_register_sharding_1.1_62117a2a5ee6d3e0_.log
2025-12-04T13:14:20.5027378Z Running 3 items in this shard: test/distributed/tensor/experimental/test_register_sharding.py::TestRegisterSharding::test_argmax, test/distributed/tensor/experimental/test_register_sharding.py::TestRegisterSharding::test_register_sharding_for_tensor_kwargs, test/distributed/tensor/experimental/test_register_sharding.py::TestRegisterSharding::test_softmax_fwd
2025-12-04T13:14:20.5028923Z
2025-12-04T13:14:20.5029394Z Finished distributed/tensor/experimental/test_register_sharding 1/1 ... [2025-12-04 13:14:20.502066][2239170.992040239], took 0.27min
2025-12-04T13:14:20.5034702Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:14:20.5051803Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:14:20.5053821Z Running distributed/test_backends 1/1 ... [2025-12-04 13:14:20.505262][2239170.995243614]
2025-12-04T13:14:20.5054110Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:14:20.5055836Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_backends.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:14:20.505469]
2025-12-04T13:14:23.4753849Z
2025-12-04T13:14:23.4755021Z distributed/test_backends 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_backends_1.1_f5b00254cfd3368b_.log
2025-12-04T13:14:23.4756643Z Running 2 items in this shard: test/distributed/test_backends.py::TestMiscCollectiveUtilsCUDA::test_create_pg_cuda, test/distributed/test_backends.py::TestMiscCollectiveUtilsCUDA::test_device_to_backend_mapping_cuda
2025-12-04T13:14:23.4757561Z
2025-12-04T13:14:23.4757897Z Finished distributed/test_backends 1/1 ... [2025-12-04 13:14:23.474987][2239173.964961097], took 0.05min
2025-12-04T13:14:23.4763927Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:14:23.4779538Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:14:23.4781518Z Running distributed/tensor/test_experimental_ops 1/1 ... [2025-12-04 13:14:23.478027][2239173.968008044]
2025-12-04T13:14:23.4781886Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:14:23.4783561Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_experimental_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:14:23.478232]
2025-12-04T13:14:39.4219363Z
2025-12-04T13:14:39.4220154Z distributed/tensor/test_experimental_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_experimental_ops_1.1_db4c79a3613d6e71_.log
2025-12-04T13:14:39.4222111Z Running 6 items in this shard: test/distributed/tensor/test_experimental_ops.py::DistOtherOpsTest::test_bernoulli, test/distributed/tensor/test_experimental_ops.py::DistOtherOpsTest::test_nll, test/distributed/tensor/test_experimental_ops.py::DistOtherOpsTest::test_slice, test/distributed/tensor/test_experimental_ops.py::DistOtherOpsTestWithLocalTensor::test_bernoulli, test/distributed/tensor/test_experimental_ops.py::DistOtherOpsTestWithLocalTensor::test_nll, test/distributed/tensor/test_experimental_ops.py::DistOtherOpsTestWithLocalTensor::test_slice
2025-12-04T13:14:39.4223206Z
2025-12-04T13:14:39.4223400Z Finished distributed/tensor/test_experimental_ops 1/1 ... [2025-12-04 13:14:39.421605][2239189.911581721], took 0.27min
2025-12-04T13:14:39.4236261Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:14:39.4243788Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:14:39.4246035Z Running distributed/checkpoint/test_quantized_hf_storage 1/1 ... [2025-12-04 13:14:39.424528][2239189.914509329]
2025-12-04T13:14:39.4246268Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:14:39.4248197Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_quantized_hf_storage.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:14:39.424727]
2025-12-04T13:14:41.6436222Z
2025-12-04T13:14:41.6437491Z distributed/checkpoint/test_quantized_hf_storage 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_quantized_hf_storage_1.1_475e22ffac889e6a_.log
2025-12-04T13:14:41.6439491Z Running 2 items in this shard: test/distributed/checkpoint/test_quantized_hf_storage.py::TestQuantizedHfStorage::test_dequantization, test/distributed/checkpoint/test_quantized_hf_storage.py::TestQuantizedHfStorage::test_dtensor_slice_dequantization_block_alignment
2025-12-04T13:14:41.6440800Z
2025-12-04T13:14:41.6441244Z Finished distributed/checkpoint/test_quantized_hf_storage 1/1 ... [2025-12-04 13:14:41.643296][2239192.133272977], took 0.04min
2025-12-04T13:14:41.6445412Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:14:41.6461466Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:14:41.6463392Z Running distributed/_composable/test_composability/test_pp_composability 1/1 ... [2025-12-04 13:14:41.646215][2239192.136195885]
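
The recurring "Failed to parse and upload json test reports: Unable to locate credentials" line is the stock botocore error for an empty AWS credential chain: the runner can execute the tests but cannot push the parsed reports to S3, so the failure is caught and logged per file instead of aborting the run. A minimal sketch of that guard, with upload_report and the bucket/key names as hypothetical stand-ins for PyTorch's actual upload tooling:

    # Sketch only: upload_report, bucket and key are hypothetical stand-ins.
    import boto3
    from botocore.exceptions import NoCredentialsError

    def upload_report(xml_path: str, bucket: str, key: str) -> None:
        # boto3 resolves credentials lazily, so a missing credential chain
        # only surfaces when the first request is actually signed.
        boto3.client("s3").upload_file(xml_path, bucket, key)

    try:
        upload_report("test-reports/report.xml", "example-bucket", "reports/report.xml")
    except NoCredentialsError as err:
        # str(err) is exactly "Unable to locate credentials", as in the log.
        print(f"Failed to parse and upload json test reports: {err}")
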
2025-12-04T13:14:41.6463824Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:14:41.6465627Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/test_composability/test_pp_composability.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:14:41.646418]
2025-12-04T13:14:43.6148461Z
2025-12-04T13:14:43.6149789Z distributed/_composable/test_composability/test_pp_composability 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.test_composability.test_pp_composability_1.1_9ca7fbe0ae998448_.log
2025-12-04T13:14:43.6162023Z Running 26 items in this shard: test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_3d_with_tp_dp_pp_ScheduleClass0_bfloat16, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_3d_with_tp_dp_pp_ScheduleClass0_float32, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_3d_with_tp_dp_pp_ScheduleClass1_bfloat16, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_3d_with_tp_dp_pp_ScheduleClass1_float32, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_3d_with_tp_dp_pp_ScheduleClass2_bfloat16, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_3d_with_tp_dp_pp_ScheduleClass2_float32, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_3d_with_tp_dp_pp_ScheduleClass3_bfloat16, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_3d_with_tp_dp_pp_ScheduleClass3_float32, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_3d_with_tp_dp_pp_ScheduleClass4_bfloat16, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_3d_with_tp_dp_pp_ScheduleClass4_float32, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_pp_and_dcp, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_ScheduleClass0_bfloat16, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_ScheduleClass0_float32, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_ScheduleClass1_bfloat16, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_ScheduleClass1_float32, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_ScheduleClass2_bfloat16, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_ScheduleClass2_float32, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_ScheduleClass3_bfloat16, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_ScheduleClass3_float32, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_ScheduleClass4_bfloat16, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_ScheduleClass4_float32, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_grads_ScheduleClass0, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_grads_ScheduleClass1, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_grads_ScheduleClass2, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_grads_ScheduleClass3, test/distributed/_composable/test_composability/test_pp_composability.py::ComposabilityTest::test_replicate_pp_grads_ScheduleClass4
2025-12-04T13:14:43.6169807Z
2025-12-04T13:14:43.6170077Z Finished distributed/_composable/test_composability/test_pp_composability 1/1 ... [2025-12-04 13:14:43.614474][2239194.104450133], took 0.03min
2025-12-04T13:14:43.6170854Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:14:43.6177876Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:14:43.6180878Z Running distributed/checkpoint/test_async_process_executor 1/1 ... [2025-12-04 13:14:43.617920][2239194.107901044]
2025-12-04T13:14:43.6181637Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:14:43.6182718Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_async_process_executor.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:14:43.618138]
2025-12-04T13:15:10.2271439Z
2025-12-04T13:15:10.2272698Z distributed/checkpoint/test_async_process_executor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_async_process_executor_1.1_80236fabcf18f83d_.log
2025-12-04T13:15:10.2276288Z Running 5 items in this shard: test/distributed/checkpoint/test_async_process_executor.py::TestAsyncProcessExecutor::test_checkpoint_save_failure_continues_serving, test/distributed/checkpoint/test_async_process_executor.py::TestAsyncProcessExecutorPrefixStore::test_checkpoint_save_with_prefix_store_enabled, test/distributed/checkpoint/test_async_process_executor.py::TestProcessGroupInitInfo::test_process_group_init_info_with_default_pg, test/distributed/checkpoint/test_async_process_executor.py::TestProcessGroupInitInfo::test_process_group_init_info_with_prefix_store_env_var, test/distributed/checkpoint/test_async_process_executor.py::TestProcessGroupInitInfo::test_process_group_init_info_without_prefix_store_env_var
2025-12-04T13:15:10.2278482Z
2025-12-04T13:15:10.2278815Z Finished distributed/checkpoint/test_async_process_executor 1/1 ... [2025-12-04 13:15:10.226927][2239220.716903282], took 0.44min
2025-12-04T13:15:10.2283032Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:15:10.2300032Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:15:10.2302168Z Running distributed/tensor/test_tensor_ops 1/4 ... [2025-12-04 13:15:10.230098][2239220.720079787]
2025-12-04T13:15:10.2302485Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:15:10.2304355Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_tensor_ops.py', '--shard-id=1', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:15:10.230313]
2025-12-04T13:15:50.5121901Z
2025-12-04T13:15:50.5122697Z distributed/tensor/test_tensor_ops 1/4 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_tensor_ops_1.4_26a00dbdb7a1624c_.log
2025-12-04T13:15:50.5125583Z Running 11 items in this shard: test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_aten_contiguous, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_detach, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_equal, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_index_put_scalar, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_op_out_variant, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_slice, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_unbind, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_zeros_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_clone, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_contiguous, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_new_full
2025-12-04T13:15:50.5127842Z
2025-12-04T13:15:50.5129604Z Finished distributed/tensor/test_tensor_ops 1/4 ... [2025-12-04 13:15:50.511947][2239261.001922237], took 0.67min
2025-12-04T13:15:50.5135361Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:15:50.5152684Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:15:50.5154620Z Running distributed/tensor/test_tensor_ops 4/4 ... [2025-12-04 13:15:50.515351][2239261.005332798]
2025-12-04T13:15:50.5154888Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:15:50.5156734Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_tensor_ops.py', '--shard-id=4', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:15:50.515554]
2025-12-04T13:16:27.4940608Z
2025-12-04T13:16:27.4942279Z distributed/tensor/test_tensor_ops 4/4 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_tensor_ops_4.4_6109af99b1af1977_.log
2025-12-04T13:16:27.4948688Z Running 16 items in this shard: test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_contiguous, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_index_put_tensor, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_new_full, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_ones_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_ones_like_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_stack_cache, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_zeros_like_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_detach, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_empty_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_fill_inplace, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_fill_inplace_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_gather, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_index_put_scalar, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_index_put_tensor, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_new_empty_strided, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_stack_cache
2025-12-04T13:16:27.4953214Z
2025-12-04T13:16:27.4953496Z Finished distributed/tensor/test_tensor_ops 4/4 ... [2025-12-04 13:16:27.493698][2239297.983676359], took 0.62min
2025-12-04T13:16:27.4954371Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:16:27.4965365Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:16:27.4967179Z Running distributed/checkpoint/fsdp/test_fsdp_dsd 1/1 ... [2025-12-04 13:16:27.496599][2239297.986579958]
2025-12-04T13:16:27.4967506Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:16:27.4969518Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/fsdp/test_fsdp_dsd.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:16:27.496825]
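
distributed/tensor/test_tensor_ops is the first file above split across shards: this job ran shard 1/4 (11 items) and shard 4/4 (16 items), each a disjoint subset of the collected tests. The simplest deterministic assignment, round-robin over a stable ordering, shows the shape of the idea; a sketch under that assumption (PyTorch's real scheduler can also weight shards by recorded test durations, which is why the shard sizes differ):

    # Sketch only: deterministic round-robin sharding over a stable order.
    def shard_items(items: list[str], shard_id: int, num_shards: int) -> list[str]:
        assert 1 <= shard_id <= num_shards
        ordered = sorted(items)  # every shard must agree on the global order
        return [t for i, t in enumerate(ordered) if i % num_shards == shard_id - 1]

    tests = [f"test_{i}" for i in range(10)]
    print(shard_items(tests, 1, 4))  # ['test_0', 'test_4', 'test_8']
    print(shard_items(tests, 4, 4))  # ['test_3', 'test_7']
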
2025-12-04T13:17:15.1407704Z
2025-12-04T13:17:15.1409076Z distributed/checkpoint/fsdp/test_fsdp_dsd 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.fsdp.test_fsdp_dsd_1.1_6f3185450491354c_.log
2025-12-04T13:17:15.1413441Z Running 6 items in this shard: test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_1d_fsdp_cpu_offload_full_model_state_dict, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_1d_fsdp_get_model_state_dict, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_save_with_fsdp1_and_load_with_fsdp2, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_save_with_fsdp1_and_load_with_fsdp2_tp, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_save_with_fsdp2_tp_and_load_with_tp, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_save_with_tp_and_load_with_fsdp2_tp
2025-12-04T13:17:15.1417384Z
2025-12-04T13:17:15.1417816Z Finished distributed/checkpoint/fsdp/test_fsdp_dsd 1/1 ... [2025-12-04 13:17:15.140562][2239345.630536588], took 0.79min
2025-12-04T13:17:15.1419335Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:17:15.1437408Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:17:15.1439952Z Running distributed/checkpoint/test_save_load_api 1/1 ... [2025-12-04 13:17:15.143895][2239345.633875611]
2025-12-04T13:17:15.1440248Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:17:15.1442245Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_save_load_api.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:17:15.144099]
2025-12-04T13:17:25.6764697Z
2025-12-04T13:17:25.6766142Z distributed/checkpoint/test_save_load_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_save_load_api_1.1_3ef36bfc34cbbeb5_.log
2025-12-04T13:17:25.6767801Z Running 2 items in this shard: test/distributed/checkpoint/test_save_load_api.py::TestSaveAndLoadAPI::test_assert_same_keys, test/distributed/checkpoint/test_save_load_api.py::TestSaveAndLoadAPI::test_auto_detect
2025-12-04T13:17:25.6768477Z
2025-12-04T13:17:25.6768802Z Finished distributed/checkpoint/test_save_load_api 1/1 ... [2025-12-04 13:17:25.676135][2239356.166109289], took 0.18min
2025-12-04T13:17:25.6778377Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:17:25.6797913Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:17:25.6800719Z Running distributed/tensor/debug/test_comm_mode_features 1/1 ... [2025-12-04 13:17:25.679925][2239356.169905485]
2025-12-04T13:17:25.6801101Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:17:25.6802619Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/debug/test_comm_mode_features.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:17:25.680126]
2025-12-04T13:17:57.9007060Z
2025-12-04T13:17:57.9008523Z distributed/tensor/debug/test_comm_mode_features 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.debug.test_comm_mode_features_1.1_23eef69f2c6b62c5_.log
2025-12-04T13:17:57.9011712Z Running 4 items in this shard: test/distributed/tensor/debug/test_comm_mode_features.py::TestCommModeFeatures::test_MLPStacked_distributed_sharding_display, test/distributed/tensor/debug/test_comm_mode_features.py::TestCommModeFeatures::test_MLP_distributed_sharding_display, test/distributed/tensor/debug/test_comm_mode_features.py::TestCommModeFeatures::test_MLP_module_tracing, test/distributed/tensor/debug/test_comm_mode_features.py::TestCommModeFeatures::test_transformer_module_tracing
2025-12-04T13:17:57.9013741Z
2025-12-04T13:17:57.9014182Z Finished distributed/tensor/debug/test_comm_mode_features 1/1 ... [2025-12-04 13:17:57.900416][2239388.390391889], took 0.54min
2025-12-04T13:17:57.9020624Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:17:57.9037592Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:17:57.9039645Z Running distributed/checkpoint/test_traverse 1/1 ... [2025-12-04 13:17:57.903831][2239388.393812131]
2025-12-04T13:17:57.9039897Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:17:57.9041744Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_traverse.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:17:57.904043]
2025-12-04T13:18:00.0717341Z
2025-12-04T13:18:00.0721303Z distributed/checkpoint/test_traverse 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_traverse_1.1_59730c1e2b43272d_.log
2025-12-04T13:18:00.0723511Z Running 7 items in this shard: test/distributed/checkpoint/test_traverse.py::TestTraverse::test_get_element, test/distributed/checkpoint/test_traverse.py::TestTraverse::test_set_element, test/distributed/checkpoint/test_traverse.py::TestTraverse::test_traverse_doesnt_ignore_intermediate_collections, test/distributed/checkpoint/test_traverse.py::TestTraverse::test_traverse_nested_dict, test/distributed/checkpoint/test_traverse.py::TestTraverse::test_traverse_nested_list, test/distributed/checkpoint/test_traverse.py::TestTraverse::test_traverse_shallow, test/distributed/checkpoint/test_traverse.py::TestTraverse::test_traverse_with_ordered_dict
2025-12-04T13:18:00.0725171Z
2025-12-04T13:18:00.0725426Z Finished distributed/checkpoint/test_traverse 1/1 ... [2025-12-04 13:18:00.071356][2239390.561332208], took 0.04min
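
Each "Parsing testcases for test report" step reads a pytest JUnit-style XML like the one named above. Extracting per-test names and durations from that format needs only the standard library; a minimal sketch, assuming the conventional <testsuite><testcase .../> layout rather than anything specific to PyTorch's parser:

    # Sketch only: pull (classname, name, time) triples from a JUnit XML.
    import xml.etree.ElementTree as ET

    def parse_testcases(xml_path: str) -> list[tuple[str, str, float]]:
        root = ET.parse(xml_path).getroot()
        # iter() walks the whole tree, so nesting under <testsuites> is fine.
        return [
            (tc.get("classname", ""), tc.get("name", ""), float(tc.get("time", "0")))
            for tc in root.iter("testcase")
        ]

    for classname, name, seconds in parse_testcases(
        "distributed.test_dynamo_distributed-40da44670c5524ca.xml"
    ):
        print(f"{classname}::{name}: {seconds:.3f}s")
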
2025-12-04T13:18:00.0727939Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:18:00.0745911Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:18:00.0749053Z Running distributed/tensor/test_random_ops 1/1 ... [2025-12-04 13:18:00.074717][2239390.56469642]
2025-12-04T13:18:00.0749496Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:18:00.0750795Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_random_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:18:00.074951]
2025-12-04T13:19:04.4983290Z
2025-12-04T13:19:04.4984405Z distributed/tensor/test_random_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_random_ops_1.1_89005651b5eec195_.log
2025-12-04T13:19:04.4995429Z Running 28 items in this shard: test/distributed/tensor/test_random_ops.py::DistTensorRandomInitTest::test_fsdp_tp_model_meta_init, test/distributed/tensor/test_random_ops.py::DistTensorRandomInitTest::test_init_ops, test/distributed/tensor/test_random_ops.py::DistTensorRandomInitTest::test_init_with_user_generator, test/distributed/tensor/test_random_ops.py::DistTensorRandomInitTest::test_meta_tensor_init, test/distributed/tensor/test_random_ops.py::DistTensorRandomInitTest::test_tp_model_meta_init, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTest::test_deterministic_dropout_1d, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTest::test_deterministic_rand_1d, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTest::test_deterministic_uniform_2d, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTest::test_manual_seed, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTest::test_manual_seed_submesh, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTest::test_philox_state_seed_roundtrip, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTest::test_pipeline_parallel_manual_seed, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTest::test_rng_tracker_init, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpsTest3D::test_hsdp_tp_model_meta_init, test/distributed/tensor/test_random_ops.py::DistTensorRandomInitTestWithLocalTensor::test_fsdp_tp_model_meta_init, test/distributed/tensor/test_random_ops.py::DistTensorRandomInitTestWithLocalTensor::test_init_ops, test/distributed/tensor/test_random_ops.py::DistTensorRandomInitTestWithLocalTensor::test_init_with_user_generator, test/distributed/tensor/test_random_ops.py::DistTensorRandomInitTestWithLocalTensor::test_meta_tensor_init, test/distributed/tensor/test_random_ops.py::DistTensorRandomInitTestWithLocalTensor::test_tp_model_meta_init, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTestWithLocalTensor::test_deterministic_dropout_1d, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTestWithLocalTensor::test_deterministic_rand_1d, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTestWithLocalTensor::test_deterministic_uniform_2d, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTestWithLocalTensor::test_manual_seed, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTestWithLocalTensor::test_manual_seed_submesh, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTestWithLocalTensor::test_philox_state_seed_roundtrip, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTestWithLocalTensor::test_pipeline_parallel_manual_seed, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpTestWithLocalTensor::test_rng_tracker_init, test/distributed/tensor/test_random_ops.py::DistTensorRandomOpsTest3DWithLocalTensor::test_hsdp_tp_model_meta_init
2025-12-04T13:19:04.5002382Z
2025-12-04T13:19:04.5002606Z Finished distributed/tensor/test_random_ops 1/1 ... [2025-12-04 13:19:04.498164][2239454.988140021], took 1.07min
2025-12-04T13:19:04.5003284Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:19:04.5015499Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:19:04.5017621Z Running distributed/_composable/test_replicate_mixed_precision 1/1 ... [2025-12-04 13:19:04.501686][2239454.991667041]
2025-12-04T13:19:04.5017878Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:19:04.5020045Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/test_replicate_mixed_precision.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:19:04.501893]
2025-12-04T13:19:49.1921756Z
2025-12-04T13:19:49.1923128Z distributed/_composable/test_replicate_mixed_precision 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.test_replicate_mixed_precision_1.1_2200dd02f29b536d_.log
2025-12-04T13:19:49.1927619Z Running 9 items in this shard: test/distributed/_composable/test_replicate_mixed_precision.py::TestReplicateMixedPrecisionTraining::test_compute_dtype, test/distributed/_composable/test_replicate_mixed_precision.py::TestReplicateMixedPrecisionTraining::test_grad_acc_with_reduce_dtype, test/distributed/_composable/test_replicate_mixed_precision.py::TestReplicateMixedPrecisionTraining::test_reduce_dtype, test/distributed/_composable/test_replicate_mixed_precision.py::TestReplicateMixedPrecisionCasts::test_clamp_reduce_dtype, test/distributed/_composable/test_replicate_mixed_precision.py::TestReplicateMixedPrecisionCasts::test_dataclass_input, test/distributed/_composable/test_replicate_mixed_precision.py::TestReplicateMixedPrecisionCasts::test_float16_on_one_submodule, test/distributed/_composable/test_replicate_mixed_precision.py::TestReplicateMixedPrecisionCasts::test_norm_modules_bf16, test/distributed/_composable/test_replicate_mixed_precision.py::TestReplicateMixedPrecisionCasts::test_norm_modules_fp16, test/distributed/_composable/test_replicate_mixed_precision.py::TestReplicateMixedPrecisionCasts::test_submodules_with_external_inputs
2025-12-04T13:19:49.1932387Z
2025-12-04T13:19:49.1932668Z Finished distributed/_composable/test_replicate_mixed_precision 1/1 ... [2025-12-04 13:19:49.191788][2239499.681762947], took 0.74min
2025-12-04T13:19:49.1934737Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:19:49.1952979Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:19:49.1955099Z Running distributed/_composable/fsdp/test_fully_shard_logging 1/1 ... [2025-12-04 13:19:49.195352][2239499.685333206]
2025-12-04T13:19:49.1955433Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:19:49.1956742Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_logging.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:19:49.195556]
2025-12-04T13:19:51.1297133Z
2025-12-04T13:19:51.1298751Z distributed/_composable/fsdp/test_fully_shard_logging 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_logging_1.1_39d50c56fa28da93_.log
2025-12-04T13:19:51.1299832Z Running 0 items in this shard:
2025-12-04T13:19:51.1300053Z
2025-12-04T13:19:51.1300557Z Finished distributed/_composable/fsdp/test_fully_shard_logging 1/1 ... [2025-12-04 13:19:51.129345][2239501.619321483], took 0.03min
2025-12-04T13:19:51.1305303Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:19:51.1321168Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:19:51.1323396Z Running distributed/_composable/fsdp/test_fully_shard_ignore_params 1/1 ... [2025-12-04 13:19:51.132228][2239501.622208922]
2025-12-04T13:19:51.1323843Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:19:51.1326162Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_ignore_params.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:19:51.132467]
2025-12-04T13:20:01.7666511Z
2025-12-04T13:20:01.7668078Z distributed/_composable/fsdp/test_fully_shard_ignore_params 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_ignore_params_1.1_e25fd7b77028f9d1_.log
2025-12-04T13:20:01.7669693Z Running 1 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_ignore_params.py::TestFullyShardIgnoreParams::test_ddp_A_fsdp_B_ddp_C
2025-12-04T13:20:01.7670543Z
2025-12-04T13:20:01.7670876Z Finished distributed/_composable/fsdp/test_fully_shard_ignore_params 1/1 ... [2025-12-04 13:20:01.766342][2239512.256319476], took 0.18min
2025-12-04T13:20:01.7675167Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:20:01.7690511Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:20:01.7692755Z Running distributed/checkpoint/_experimental/test_staging 1/1 ... [2025-12-04 13:20:01.769180][2239512.259160626]
2025-12-04T13:20:01.7693119Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:20:01.7697054Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/_experimental/test_staging.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:20:01.769387]
2025-12-04T13:20:04.2381031Z
2025-12-04T13:20:04.2386005Z distributed/checkpoint/_experimental/test_staging 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint._experimental.test_staging_1.1_4b702d11a445c3f0_.log
2025-12-04T13:20:04.2390789Z Running 7 items in this shard: test/distributed/checkpoint/_experimental/test_staging.py::TestDefaultStager::test_async_staging, test/distributed/checkpoint/_experimental/test_staging.py::TestDefaultStager::test_cuda_non_blocking_without_cuda, test/distributed/checkpoint/_experimental/test_staging.py::TestDefaultStager::test_cuda_tensors_staging, test/distributed/checkpoint/_experimental/test_staging.py::TestDefaultStager::test_different_option_combinations, test/distributed/checkpoint/_experimental/test_staging.py::TestDefaultStager::test_multiple_staging_operations, test/distributed/checkpoint/_experimental/test_staging.py::TestDefaultStager::test_resource_cleanup, test/distributed/checkpoint/_experimental/test_staging.py::TestDefaultStager::test_sync_staging
2025-12-04T13:20:04.2392782Z
2025-12-04T13:20:04.2393046Z Finished distributed/checkpoint/_experimental/test_staging 1/1 ... [2025-12-04 13:20:04.237734][2239514.72771002], took 0.04min
2025-12-04T13:20:04.2393913Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:20:04.2411365Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:20:04.2413610Z Running distributed/checkpoint/test_fsdp_tp_checkpoint_conversion 1/1 ... [2025-12-04 13:20:04.241232][2239514.73121275]
2025-12-04T13:20:04.2413929Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:20:04.2415768Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_fsdp_tp_checkpoint_conversion.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:20:04.241432]
2025-12-04T13:20:15.2756502Z
2025-12-04T13:20:15.2757444Z distributed/checkpoint/test_fsdp_tp_checkpoint_conversion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_fsdp_tp_checkpoint_conversion_1.1_fa9c9cb498ec3d7d_.log
2025-12-04T13:20:15.2758116Z Running 1 items in this shard: test/distributed/checkpoint/test_fsdp_tp_checkpoint_conversion.py::TestFsdpTpCheckpointConversion::test_fsdp_to_tp
2025-12-04T13:20:15.2758352Z
2025-12-04T13:20:15.2758570Z Finished distributed/checkpoint/test_fsdp_tp_checkpoint_conversion 1/1 ... [2025-12-04 13:20:15.275436][2239525.765410449], took 0.18min
2025-12-04T13:20:15.2770461Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:20:15.2787547Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:20:15.2789778Z Running distributed/launcher/test_api 1/1 ... [2025-12-04 13:20:15.278909][2239525.76888931]
2025-12-04T13:20:15.2789994Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:20:15.2792123Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/launcher/test_api.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:20:15.279122]
2025-12-04T13:20:17.4472189Z
2025-12-04T13:20:17.4473357Z distributed/launcher/test_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.launcher.test_api_1.1_1ec5b3fda6247e95_.log
2025-12-04T13:20:17.4474565Z Running 2 items in this shard: test/distributed/launcher/test_api.py::LauncherApiTest::test_launch_agent_default_signals, test/distributed/launcher/test_api.py::LauncherApiTest::test_launch_agent_sets_signals_env_var
2025-12-04T13:20:17.4475723Z
2025-12-04T13:20:17.4475980Z Finished distributed/launcher/test_api 1/1 ... [2025-12-04 13:20:17.446969][2239527.936944633], took 0.04min
2025-12-04T13:20:17.4487492Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:20:17.4504811Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:20:17.4507015Z Running distributed/elastic/multiprocessing/test_api 1/1 ... [2025-12-04 13:20:17.450597][2239527.940578442]
2025-12-04T13:20:17.4507379Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:20:17.4509394Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/multiprocessing/test_api.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:20:17.450803]
2025-12-04T13:20:19.5695647Z
2025-12-04T13:20:19.5696917Z distributed/elastic/multiprocessing/test_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.multiprocessing.test_api_1.1_0531c6b6ddd3cfb7_.log
2025-12-04T13:20:19.5700618Z Running 7 items in this shard: test/distributed/elastic/multiprocessing/test_api.py::SignalHandlingTest::test_start_handles_invalid_signals, test/distributed/elastic/multiprocessing/test_api.py::SignalHandlingTest::test_start_handles_windows_signals, test/distributed/elastic/multiprocessing/test_api.py::SignalHandlingTest::test_start_not_main_thread, test/distributed/elastic/multiprocessing/test_api.py::SignalHandlingTest::test_start_registers_custom_signals, test/distributed/elastic/multiprocessing/test_api.py::SignalHandlingTest::test_start_registers_default_signals, test/distributed/elastic/multiprocessing/test_api.py::SignalHandlingTest::test_start_supports_sigusr1_and_sigusr2, test/distributed/elastic/multiprocessing/test_api.py::SignalHandlingTest::test_terminate_process_handler
2025-12-04T13:20:19.5703239Z
2025-12-04T13:20:19.5703579Z Finished distributed/elastic/multiprocessing/test_api 1/1 ... [2025-12-04 13:20:19.569308][2239530.059283638], took 0.04min
2025-12-04T13:20:19.5709196Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:20:19.5726246Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:20:19.5728280Z Running distributed/fsdp/test_shard_utils 1/1 ... [2025-12-04 13:20:19.572741][2239530.062722049]
2025-12-04T13:20:19.5728590Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:20:19.5730632Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_shard_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:20:19.572955]
2025-12-04T13:20:30.1055566Z
2025-12-04T13:20:30.1056707Z distributed/fsdp/test_shard_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_shard_utils_1.1_5a2a063cfd30649d_.log
2025-12-04T13:20:30.1058059Z Running 2 items in this shard: test/distributed/fsdp/test_shard_utils.py::TestShardUtilsDistributed::test_create_chunk_sharded_tensor, test/distributed/fsdp/test_shard_utils.py::TestShardUtilsDistributedDTensor::test_create_chunk_dtensor
2025-12-04T13:20:30.1058827Z
2025-12-04T13:20:30.1059154Z Finished distributed/fsdp/test_shard_utils 1/1 ... [2025-12-04 13:20:30.105334][2239540.595307646], took 0.18min
2025-12-04T13:20:30.1071742Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:20:30.1092021Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:20:30.1094288Z Running distributed/tensor/experimental/test_local_map 1/1 ... [2025-12-04 13:20:30.109345][2239540.599325949]
2025-12-04T13:20:30.1094639Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:20:30.1096661Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/experimental/test_local_map.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:20:30.109556]
2025-12-04T13:21:06.8362635Z
2025-12-04T13:21:06.8367733Z distributed/tensor/experimental/test_local_map 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.experimental.test_local_map_1.1_1ea183dc430830ec_.log
2025-12-04T13:21:06.8369941Z Running 6 items in this shard: test/distributed/tensor/experimental/test_local_map.py::TestLocalMap::test_local_map_correctness, test/distributed/tensor/experimental/test_local_map.py::TestLocalMap::test_local_map_in_placements, test/distributed/tensor/experimental/test_local_map.py::TestLocalMap::test_local_map_out_placements, test/distributed/tensor/experimental/test_local_map.py::TestLocalMap::test_local_map_redistribute, test/distributed/tensor/experimental/test_local_map.py::TestLocalMap::test_local_map_with_grad_placement, test/distributed/tensor/experimental/test_local_map.py::TestLocalMap::test_multi_mesh_inputs
2025-12-04T13:21:06.8371578Z
2025-12-04T13:21:06.8371838Z Finished distributed/tensor/experimental/test_local_map 1/1 ... [2025-12-04 13:21:06.835910][2239577.32588709], took 0.61min
2025-12-04T13:21:06.8375154Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:21:06.8393327Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:21:06.8394646Z Running distributed/test_local_tensor 1/1 ... [2025-12-04 13:21:06.839331][2239577.329311782]
2025-12-04T13:21:06.8394871Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:21:06.8396432Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_local_tensor.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:21:06.839544]
2025-12-04T13:21:09.3589870Z
2025-12-04T13:21:09.3591362Z distributed/test_local_tensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_local_tensor_1.1_4967d9bd1f817964_.log
2025-12-04T13:21:09.3599610Z Running 21 items in this shard: test/distributed/test_local_tensor.py::TestLocalTensorWorld2::test_basic_arithmetic_operations, test/distributed/test_local_tensor.py::TestLocalTensorWorld2::test_collectives_within_local_tensor_mode, test/distributed/test_local_tensor.py::TestLocalTensorWorld2::test_empty_local_tensors, test/distributed/test_local_tensor.py::TestLocalTensorWorld2::test_even_sharding_mean_is_partial, test/distributed/test_local_tensor.py::TestLocalTensorWorld2::test_local_tensor_creation_fails_with_grad_tensors, test/distributed/test_local_tensor.py::TestLocalTensorWorld2::test_local_tensor_dtype_consistency, test/distributed/test_local_tensor.py::TestLocalTensorWorld2::test_local_tensor_mode, test/distributed/test_local_tensor.py::TestLocalTensorWorld2::test_mixed_operations_with_regular_tensors, test/distributed/test_local_tensor.py::TestLocalTensorWorld2::test_scalar_mul_reduction_bug, test/distributed/test_local_tensor.py::TestLocalTensorWorld2::test_uneven_sharding_mean_bug, test/distributed/test_local_tensor.py::TestLocalTensorWorld2::test_uneven_sharding_prod, test/distributed/test_local_tensor.py::TestLocalTensorWorld3::test_all_gather_collective, test/distributed/test_local_tensor.py::TestLocalTensorWorld3::test_all_gather_into_tensor_collective, test/distributed/test_local_tensor.py::TestLocalTensorWorld3::test_all_reduce_collective, test/distributed/test_local_tensor.py::TestLocalTensorWorld3::test_all_to_all_single_collective, test/distributed/test_local_tensor.py::TestLocalTensorWorld3::test_broadcast_collective, test/distributed/test_local_tensor.py::TestLocalTensorWorld3::test_collective_reduction_operations, test/distributed/test_local_tensor.py::TestLocalTensorWorld3::test_reduce_scatter_tensor_collective, test/distributed/test_local_tensor.py::TestLocalTensorWorld4::test_dtensor_cat, test/distributed/test_local_tensor.py::TestLocalTensorWorld8::test_dtensor_addmm, test/distributed/test_local_tensor.py::TestLocalRunner::test_dp_pp
2025-12-04T13:21:09.3604990Z
2025-12-04T13:21:09.3605313Z Finished distributed/test_local_tensor 1/1 ... [2025-12-04 13:21:09.358682][2239579.848657713], took 0.04min
2025-12-04T13:21:09.3606014Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:21:09.3621557Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:21:09.3623698Z Running distributed/_composable/fsdp/test_fully_shard_state 1/1 ... [2025-12-04 13:21:09.362272][2239579.852252772]
2025-12-04T13:21:09.3623962Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:21:09.3625766Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_state.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:21:09.362482]
2025-12-04T13:21:11.7310348Z
2025-12-04T13:21:11.7311092Z distributed/_composable/fsdp/test_fully_shard_state 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_state_1.1_1458c5c28d67dbf0_.log
2025-12-04T13:21:11.7313444Z Running 5 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_state.py::TestFullyShardState::test_fully_shard_cls, test/distributed/_composable/fsdp/test_fully_shard_state.py::TestFullyShardState::test_fully_shard_deepcopy, test/distributed/_composable/fsdp/test_fully_shard_state.py::TestFullyShardState::test_fully_shard_reapply, test/distributed/_composable/fsdp/test_fully_shard_state.py::TestFullyShardState::test_fully_shard_state, test/distributed/_composable/fsdp/test_fully_shard_state.py::TestFullyShardState::test_fully_shard_unsupported_module_cls
2025-12-04T13:21:11.7315154Z
2025-12-04T13:21:11.7315479Z Finished distributed/_composable/fsdp/test_fully_shard_state 1/1 ... [2025-12-04 13:21:11.730774][2239582.220749589], took 0.04min
2025-12-04T13:21:11.7324819Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:21:11.7342310Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:21:11.7344533Z Running distributed/checkpoint/test_tp_checkpoint 1/1 ... [2025-12-04 13:21:11.734342][2239582.224322668]
2025-12-04T13:21:11.7344868Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:21:11.7346556Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_tp_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:21:11.734538]
2025-12-04T13:21:27.6268079Z
2025-12-04T13:21:27.6269066Z distributed/checkpoint/test_tp_checkpoint 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_tp_checkpoint_1.1_9c08bb4c70ba1bc7_.log
2025-12-04T13:21:27.6270463Z Running 2 items in this shard: test/distributed/checkpoint/test_tp_checkpoint.py::TestTpCheckpoint::test_tp_checkpoint, test/distributed/checkpoint/test_tp_checkpoint.py::TestTpCheckpoint::test_tp_checkpoint_load_on_meta_device
2025-12-04T13:21:27.6271820Z
2025-12-04T13:21:27.6272099Z Finished distributed/checkpoint/test_tp_checkpoint 1/1 ... [2025-12-04 13:21:27.626528][2239598.11650423], took 0.26min
2025-12-04T13:21:27.6279906Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:21:27.6295618Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:21:27.6297995Z Running distributed/tensor/test_optimizers 1/1 ... [2025-12-04 13:21:27.629708][2239598.119688995]
2025-12-04T13:21:27.6298469Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:21:27.6300425Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_optimizers.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:21:27.629922]
2025-12-04T13:23:34.8395687Z
2025-12-04T13:23:34.8400096Z distributed/tensor/test_optimizers 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_optimizers_1.1_54e633a4b18d8e3c_.log
2025-12-04T13:23:34.8407178Z Running 24 items in this shard: test/distributed/tensor/test_optimizers.py::TestDTensorOptimizer::test_RMSprop_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizer::test_adadelta_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizer::test_adagrad_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizer::test_adam_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizer::test_adamax_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizer::test_adamw_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizer::test_admaw_fused_across_meshes, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizer::test_asgd_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizer::test_nadam_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizer::test_optimizer_foreach_supported_types_include_DTensor, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizer::test_radam_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizer::test_sgd_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizerWithLocalTensor::test_RMSprop_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizerWithLocalTensor::test_adadelta_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizerWithLocalTensor::test_adagrad_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizerWithLocalTensor::test_adam_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizerWithLocalTensor::test_adamax_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizerWithLocalTensor::test_adamw_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizerWithLocalTensor::test_admaw_fused_across_meshes, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizerWithLocalTensor::test_asgd_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizerWithLocalTensor::test_nadam_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizerWithLocalTensor::test_optimizer_foreach_supported_types_include_DTensor, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizerWithLocalTensor::test_radam_1d_sharding, test/distributed/tensor/test_optimizers.py::TestDTensorOptimizerWithLocalTensor::test_sgd_1d_sharding
2025-12-04T13:23:34.8412909Z
2025-12-04T13:23:34.8413101Z Finished distributed/tensor/test_optimizers 1/1 ... [2025-12-04 13:23:34.839274][2239725.329250296], took 2.12min
2025-12-04T13:23:34.8413701Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:23:34.8425829Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:23:34.8426085Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading
2025-12-04T13:23:34.8426287Z Uploading artifacts took 0.00 seconds
2025-12-04T13:23:34.8428421Z Running distributed/test_symmetric_memory 1/1 ... [2025-12-04 13:23:34.842725][2239725.332706567]
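
Right after test_optimizers, the artifact step short-circuits: "GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading", so "Uploading artifacts took 0.00 seconds" reflects a skip, not a fast upload. The guard is a plain environment check; a sketch with a hypothetical do_upload callable standing in for the real upload step:

    # Sketch only: 'do_upload' is a hypothetical stand-in for the real step.
    import os

    REQUIRED_VARS = ("GITHUB_RUN_ID", "GITHUB_RUN_ATTEMPT", "ARTIFACTS_FILE_SUFFIX")

    def maybe_upload_artifacts(do_upload) -> None:
        if any(not os.environ.get(v) for v in REQUIRED_VARS):
            print("GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX "
                  "not set, not uploading")
            return
        do_upload()

    maybe_upload_artifacts(lambda: print("uploading..."))
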
[2025-12-04 13:23:34.842725][2239725.332706567] 2025-12-04T13:23:34.8428650Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:23:34.8432013Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_symmetric_memory.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:23:34.842943] 2025-12-04T13:24:09.0649338Z 2025-12-04T13:24:09.0653626Z distributed/test_symmetric_memory 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_symmetric_memory_1.1_ea40e054db50aa70_.log 2025-12-04T13:24:09.0678163Z Running 96 items in this shard: test/distributed/test_symmetric_memory.py::SymmetricMemoryTest::test_allow_overlapping_devices, test/distributed/test_symmetric_memory.py::SymmetricMemoryTest::test_cuda_nvlink_connectivity_detection, test/distributed/test_symmetric_memory.py::SymmetricMemoryTest::test_get_backend, test/distributed/test_symmetric_memory.py::SymmetricMemoryTest::test_get_signal_pad, test/distributed/test_symmetric_memory.py::SymmetricMemoryTest::test_has_multicast_support, test/distributed/test_symmetric_memory.py::SymmetricMemoryTest::test_large_alloc, test/distributed/test_symmetric_memory.py::SymmetricMemoryTest::test_low_contention_all_gather_symm_mem_input_False, test/distributed/test_symmetric_memory.py::SymmetricMemoryTest::test_low_contention_all_gather_symm_mem_input_True, test/distributed/test_symmetric_memory.py::SymmetricMemoryTest::test_low_contention_reduce_scatter_reduce_op_avg_symm_mem_input_False, test/distributed/test_symmetric_memory.py::SymmetricMemoryTest::test_low_contention_reduce_scatter_reduce_op_avg_symm_mem_input_True, test/distributed/test_symmetric_memory.py::SymmetricMemoryTest::test_low_contention_reduce_scatter_reduce_op_sum_symm_mem_input_False, test/distributed/test_symmetric_memory.py::SymmetricMemoryTest::test_low_contention_reduce_scatter_reduce_op_sum_symm_mem_input_True, test/distributed/test_symmetric_memory.py::SymmetricMemoryTest::test_subgroup, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_all_gather_matmul_gather_dim_0, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_all_gather_matmul_gather_dim_1, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_all_gather_matmul_gather_dim_2, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_all_gather_matmul_native_symm_mem_input_False_is_b_row_major_False, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_all_gather_matmul_native_symm_mem_input_False_is_b_row_major_True, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_all_gather_matmul_native_symm_mem_input_True_is_b_row_major_False, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_all_gather_matmul_native_symm_mem_input_True_is_b_row_major_True, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_all_gather_scaled_matmul_gather_dim_0_scale_mode_row-wise-replicated, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_all_gather_scaled_matmul_gather_dim_0_scale_mode_row-wise-sharded, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_all_gather_scaled_matmul_gather_dim_0_scale_mode_tensor-wise, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_all_gather_scaled_matmul_gather_dim_1_scale_mode_row-wise-replicated, 
test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_all_gather_scaled_matmul_gather_dim_1_scale_mode_row-wise-sharded, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_all_gather_scaled_matmul_gather_dim_1_scale_mode_tensor-wise, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_matmul_reduce_scatter_scatter_dim_0, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_matmul_reduce_scatter_scatter_dim_1, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_matmul_reduce_scatter_scatter_dim_2, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_scaled_matmul_reduce_scatter_scatter_dim_0_rowwise_False, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_scaled_matmul_reduce_scatter_scatter_dim_0_rowwise_True, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_scaled_matmul_reduce_scatter_scatter_dim_1_rowwise_False, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_fused_scaled_matmul_reduce_scatter_scatter_dim_1_rowwise_True, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_multimem_all_gather_matmul, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_optimal_layout_dim_0, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_optimal_layout_dim_1, test/distributed/test_symmetric_memory.py::AsyncTPTest::test_optimal_layout_dim_2, test/distributed/test_symmetric_memory.py::SymmMemEmptySetDeviceTest::test_empty_strided_p2p_persistent_set_device_False, test/distributed/test_symmetric_memory.py::SymmMemEmptySetDeviceTest::test_empty_strided_p2p_persistent_set_device_True, test/distributed/test_symmetric_memory.py::SymmMemEmptySetDeviceTest::test_empty_strided_p2p_set_device_False, test/distributed/test_symmetric_memory.py::SymmMemEmptySetDeviceTest::test_empty_strided_p2p_set_device_True, test/distributed/test_symmetric_memory.py::SymmMemNegativeTest::test_barrier_timeout, test/distributed/test_symmetric_memory.py::SymmMemNegativeTest::test_put_signal_timeout, test/distributed/test_symmetric_memory.py::SymmMemNegativeTest::test_wait_signal_timeout, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_gather_align_bytes_16, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_gather_align_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_gather_align_bytes_8, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_bfloat16_align_bytes_16_size_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_bfloat16_align_bytes_16_size_bytes_8192, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_bfloat16_align_bytes_16_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_bfloat16_align_bytes_4_size_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_bfloat16_align_bytes_4_size_bytes_8192, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_bfloat16_align_bytes_4_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_bfloat16_align_bytes_8_size_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_bfloat16_align_bytes_8_size_bytes_8192, 
test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_bfloat16_align_bytes_8_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_float32_align_bytes_16_size_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_float32_align_bytes_16_size_bytes_8192, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_float32_align_bytes_16_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_float32_align_bytes_4_size_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_float32_align_bytes_4_size_bytes_8192, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_float32_align_bytes_4_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_float32_align_bytes_8_size_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_float32_align_bytes_8_size_bytes_8192, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_all_reduce_float32_align_bytes_8_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_bfloat16_align_bytes_16_size_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_bfloat16_align_bytes_16_size_bytes_8192, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_bfloat16_align_bytes_16_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_bfloat16_align_bytes_4_size_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_bfloat16_align_bytes_4_size_bytes_8192, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_bfloat16_align_bytes_4_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_bfloat16_align_bytes_8_size_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_bfloat16_align_bytes_8_size_bytes_8192, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_bfloat16_align_bytes_8_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_float32_align_bytes_16_size_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_float32_align_bytes_16_size_bytes_8192, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_float32_align_bytes_16_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_float32_align_bytes_4_size_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_float32_align_bytes_4_size_bytes_8192, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_float32_align_bytes_4_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_float32_align_bytes_8_size_bytes_4, 
test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_float32_align_bytes_8_size_bytes_8192, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_all_reduce_float32_align_bytes_8_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_reduce_out_bfloat16_size_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_reduce_out_bfloat16_size_bytes_8192, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_reduce_out_bfloat16_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_reduce_out_float32_size_bytes_4, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_reduce_out_float32_size_bytes_8192, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_multimem_one_shot_reduce_out_float32_size_bytes_8196, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_one_shot_all_reduce, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_reduce_scatter, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_reduce_scatter_corner_cases, test/distributed/test_symmetric_memory.py::SymmMemCollectiveTest::test_two_shot_all_reduce, test/distributed/test_symmetric_memory.py::LoweringTest::test_lowering_one_shot_all_reduce, test/distributed/test_symmetric_memory.py::SymmMemSingleProcTest::test_memset32, test/distributed/test_symmetric_memory.py::SymmMemSingleProcTest::test_stream_write_value32
2025-12-04T13:24:09.0693579Z
2025-12-04T13:24:09.0693712Z Finished distributed/test_symmetric_memory 1/1 ... [2025-12-04 13:24:09.064684][2239759.554657856], took 0.57min
2025-12-04T13:24:09.0694151Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:24:09.0694580Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:24:09.0694827Z Running distributed/_tools/test_runtime_estimator 1/1 ... [2025-12-04 13:24:09.068348][2239759.558328924]
2025-12-04T13:24:09.0695035Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:24:09.0695453Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tools/test_runtime_estimator.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:24:09.068558]
2025-12-04T13:24:47.6905880Z
2025-12-04T13:24:47.6906771Z distributed/_tools/test_runtime_estimator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tools.test_runtime_estimator_1.1_7a5cad7523ebd7ec_.log
2025-12-04T13:24:47.6907866Z Running 2 items in this shard: test/distributed/_tools/test_runtime_estimator.py::TestRuntimeEstimator::test_conv_model_runtime, test/distributed/_tools/test_runtime_estimator.py::TestRuntimeEstimator::test_transformer_runtime
2025-12-04T13:24:47.6908466Z
2025-12-04T13:24:47.6908709Z Finished distributed/_tools/test_runtime_estimator 1/1 ... [2025-12-04 13:24:47.690196][2239798.180173935], took 0.64min
2025-12-04T13:24:47.6916232Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:24:47.6931599Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:24:47.6933668Z Running distributed/fsdp/test_fsdp_memory 1/1 ... [2025-12-04 13:24:47.693223][2239798.183204402]
2025-12-04T13:24:47.6933965Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:24:47.6935804Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_memory.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:24:47.693440]
2025-12-04T13:25:15.1067131Z
2025-12-04T13:25:15.1071553Z distributed/fsdp/test_fsdp_memory 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_memory_1.1_6d38db6a21e08840_.log
2025-12-04T13:25:15.1072831Z Running 2 items in this shard: test/distributed/fsdp/test_fsdp_memory.py::TestFSDPMemory::test_fsdp_memory_ckpt_ckpt, test/distributed/fsdp/test_fsdp_memory.py::TestFSDPMemory::test_fsdp_memory_ckpt_no_ckpt
2025-12-04T13:25:15.1073454Z
2025-12-04T13:25:15.1073715Z Finished distributed/fsdp/test_fsdp_memory 1/1 ... [2025-12-04 13:25:15.106344][2239825.596317592], took 0.46min
2025-12-04T13:25:15.1082797Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:25:15.1100584Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:25:15.1102969Z Running distributed/test_fake_pg 1/1 ... [2025-12-04 13:25:15.110149][2239825.600130338]
2025-12-04T13:25:15.1103273Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:25:15.1104859Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_fake_pg.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:25:15.110359]
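Every shard completion above logs "Failed to parse and upload json test reports: Unable to locate credentials", which is the standard botocore message when no AWS credentials are available to the upload step. The runner's actual upload path is not shown here; as an illustrative sketch of the same failure mode (bucket and file names below are placeholders, not the real ones):

import boto3
from botocore.exceptions import NoCredentialsError

session = boto3.Session()
if session.get_credentials() is None:
    # botocore surfaces this state as "Unable to locate credentials"
    print("no AWS credentials configured; report upload would fail")
else:
    try:
        boto3.client("s3").upload_file(
            "test/test-reports/report.json",  # hypothetical local report
            "example-test-report-bucket",     # hypothetical bucket
            "reports/report.json",
        )
    except NoCredentialsError:
        print("upload failed: unable to locate credentials")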
2025-12-04T13:25:21.0360520Z
2025-12-04T13:25:21.0361761Z distributed/test_fake_pg 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_fake_pg_1.1_85f898b4ba7c7b37_.log
2025-12-04T13:25:21.0367301Z Running 16 items in this shard: test/distributed/test_fake_pg.py::TestFakePG::test_all_reduce, test/distributed/test_fake_pg.py::TestFakePG::test_allgather, test/distributed/test_fake_pg.py::TestFakePG::test_alltoall, test/distributed/test_fake_pg.py::TestFakePG::test_alltoall_base, test/distributed/test_fake_pg.py::TestFakePG::test_broadcast, test/distributed/test_fake_pg.py::TestFakePG::test_construct_fsdp, test/distributed/test_fake_pg.py::TestFakePG::test_error_on_collective, test/distributed/test_fake_pg.py::TestFakePG::test_fake_pg_tracing, test/distributed/test_fake_pg.py::TestFakePG::test_fake_process_group_direct_usage_error, test/distributed/test_fake_pg.py::TestFakePG::test_fake_process_group_proper_usage_dispatch, test/distributed/test_fake_pg.py::TestFakePG::test_fsdp_fake_e2e, test/distributed/test_fake_pg.py::TestFakePG::test_fsdp_tp_fake_e2e, test/distributed/test_fake_pg.py::TestFakePG::test_recv, test/distributed/test_fake_pg.py::TestFakePG::test_reduce_scatter, test/distributed/test_fake_pg.py::TestFakePG::test_scatter, test/distributed/test_fake_pg.py::TestFakePG::test_send
2025-12-04T13:25:21.0370146Z
2025-12-04T13:25:21.0370404Z Finished distributed/test_fake_pg 1/1 ... [2025-12-04 13:25:21.035651][2239831.525628314], took 0.10min
2025-12-04T13:25:21.0371189Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:25:21.0385237Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:25:21.0387359Z Running distributed/checkpoint/test_fsdp_model_state 1/1 ... [2025-12-04 13:25:21.038613][2239831.528594042]
2025-12-04T13:25:21.0387682Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:25:21.0389257Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_fsdp_model_state.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:25:21.038808]
2025-12-04T13:25:38.1839563Z
2025-12-04T13:25:38.1840849Z distributed/checkpoint/test_fsdp_model_state 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_fsdp_model_state_1.1_93316b390f539f84_.log
2025-12-04T13:25:38.1842809Z Running 2 items in this shard: test/distributed/checkpoint/test_fsdp_model_state.py::FsdpModelStateCheckpoint::test_fsdp_model_state_no_resharding, test/distributed/checkpoint/test_fsdp_model_state.py::FsdpModelStateCheckpoint::test_fsdp_model_state_with_resharding
2025-12-04T13:25:38.1843952Z
2025-12-04T13:25:38.1844365Z Finished distributed/checkpoint/test_fsdp_model_state 1/1 ... [2025-12-04 13:25:38.183729][2239848.67370495], took 0.29min
2025-12-04T13:25:38.1856845Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:25:38.1873180Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:25:38.1875088Z Running distributed/fsdp/test_utils 1/1 ... [2025-12-04 13:25:38.187361][2239848.677341308]
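The test_fake_pg shard above runs collectives against a fake process group, which lets a single process impersonate one rank of a larger job without any real communication. A minimal sketch using the internal testing helper (an internal API used by these tests, not a public one, and subject to change):

import torch
import torch.distributed as dist
from torch.testing._internal.distributed.fake_pg import FakeStore

# One process pretends to be rank 0 of a 16-rank job; importing FakeStore
# registers the "fake" backend, and collectives return without communicating.
dist.init_process_group(backend="fake", rank=0, world_size=16, store=FakeStore())
t = torch.ones(4)
dist.all_reduce(t)  # no data is exchanged; useful for tracing sharding logic
dist.destroy_process_group()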
2025-12-04T13:25:38.1875440Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:25:38.1877053Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:25:38.187563]
2025-12-04T13:25:41.3070076Z
2025-12-04T13:25:41.3070776Z distributed/fsdp/test_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_utils_1.1_6be9d2aac5827288_.log
2025-12-04T13:25:41.3073451Z Running 5 items in this shard: test/distributed/fsdp/test_utils.py::TestUtilsCUDA::test_apply_to_tensors_cpu_cuda_cuda, test/distributed/fsdp/test_utils.py::TestUtilsCUDA::test_apply_to_tensors_device_list0_cuda, test/distributed/fsdp/test_utils.py::TestUtilsCUDA::test_apply_to_tensors_device_list1_cuda, test/distributed/fsdp/test_utils.py::TestUtilsCUDA::test_packed_sequence_cuda, test/distributed/fsdp/test_utils.py::TestUtilsCUDA::test_replace_by_prefix_cuda
2025-12-04T13:25:41.3074813Z
2025-12-04T13:25:41.3075070Z Finished distributed/fsdp/test_utils 1/1 ... [2025-12-04 13:25:41.306681][2239851.796656733], took 0.05min
2025-12-04T13:25:41.3086990Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:25:41.3103770Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:25:41.3105922Z Running distributed/tensor/parallel/test_tp_examples 1/1 ... [2025-12-04 13:25:41.310468][2239851.800449039]
2025-12-04T13:25:41.3106296Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:25:41.3107885Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/parallel/test_tp_examples.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:25:41.310671]
2025-12-04T13:27:47.2951609Z
2025-12-04T13:27:47.2955861Z distributed/tensor/parallel/test_tp_examples 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.parallel.test_tp_examples_1.1_159ca9a88fb7f884_.log
2025-12-04T13:27:47.2965284Z Running 16 items in this shard: test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_loss_parallel, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_mlp_inference, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_mlp_training_is_seq_parallel_False_recompute_activation_False, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_mlp_training_is_seq_parallel_True_recompute_activation_False, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_transformer_req_grad_float64_thaw_all, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_transformer_req_grad_seq_parallel_float32_thaw_all, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_transformer_req_grad_seq_parallel_float32_thaw_layers_0_attention_wv__layers_0_feed_forward_w1__layers_1_feed_forward_w2__layers_1_ffn_norm__output__tok_embeddings, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_transformer_req_grad_seq_parallel_float32_thaw_layers_1_ffn_norm__norm__output__tok_embeddings, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_transformer_req_grad_seq_parallel_float32_thaw_norm__output, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_transformer_req_grad_seq_parallel_float32_thaw_norm__output__tok_embeddings, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_transformer_req_grad_seq_parallel_float32_thaw_output__tok_embeddings, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_transformer_training_is_seq_parallel_False_float32, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_transformer_training_is_seq_parallel_False_float64, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_transformer_training_is_seq_parallel_True_float32, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_transformer_training_is_seq_parallel_True_float64, test/distributed/tensor/parallel/test_tp_examples.py::DistTensorParallelExampleTest::test_weight_tying
2025-12-04T13:27:47.2972692Z
2025-12-04T13:27:47.2972934Z Finished distributed/tensor/parallel/test_tp_examples 1/1 ... [2025-12-04 13:27:47.294923][2239977.784898346], took 2.10min
2025-12-04T13:27:47.2973763Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:27:47.2984934Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:27:47.2986868Z Running distributed/_composable/fsdp/test_fully_shard_clip_grad_norm_ 1/1 ... [2025-12-04 13:27:47.298591][2239977.788572394]
2025-12-04T13:27:47.2987165Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:27:47.2989117Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_clip_grad_norm_.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:27:47.298799]
2025-12-04T13:28:12.7556488Z
2025-12-04T13:28:12.7560042Z distributed/_composable/fsdp/test_fully_shard_clip_grad_norm_ 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_clip_grad_norm__1.1_826c915f9fe4df95_.log
2025-12-04T13:28:12.7561294Z Running 2 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_clip_grad_norm_.py::TestClipGradNormWorldSize2::test_clip_grad_norm_1d, test/distributed/_composable/fsdp/test_fully_shard_clip_grad_norm_.py::TestClipGradNormWorldSize4::test_clip_grad_norm_2d
2025-12-04T13:28:12.7561874Z
2025-12-04T13:28:12.7562116Z Finished distributed/_composable/fsdp/test_fully_shard_clip_grad_norm_ 1/1 ... [2025-12-04 13:28:12.755370][2240003.245347652], took 0.42min
2025-12-04T13:28:12.7567545Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:28:12.7584619Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:28:12.7587407Z Running distributed/tensor/debug/test_comm_mode 1/1 ... [2025-12-04 13:28:12.758639][2240003.248619875]
2025-12-04T13:28:12.7587673Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:28:12.7589577Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/debug/test_comm_mode.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:28:12.758840]
2025-12-04T13:28:17.1306522Z
2025-12-04T13:28:17.1307588Z distributed/tensor/debug/test_comm_mode 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.debug.test_comm_mode_1.1_c3596a4b13d357dd_.log
2025-12-04T13:28:17.1309037Z Running 4 items in this shard: test/distributed/tensor/debug/test_comm_mode.py::TestCommMode::test_comm_mode, test/distributed/tensor/debug/test_comm_mode.py::TestCommMode::test_comm_mode_coalesced, test/distributed/tensor/debug/test_comm_mode.py::TestCommMode::test_comm_mode_with_c10d, test/distributed/tensor/debug/test_comm_mode.py::TestCommMode::test_comm_mode_with_dtensor
2025-12-04T13:28:17.1310069Z
2025-12-04T13:28:17.1310489Z Finished distributed/tensor/debug/test_comm_mode 1/1 ... [2025-12-04 13:28:17.130255][2240007.620230611], took 0.07min
2025-12-04T13:28:17.1323382Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:28:17.1341345Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:28:17.1343574Z Running distributed/test_dist2 1/1 ... [2025-12-04 13:28:17.134232][2240007.624213015]
2025-12-04T13:28:17.1343850Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:28:17.1345598Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_dist2.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:28:17.134429]
2025-12-04T13:29:56.6047940Z
2025-12-04T13:29:56.6049457Z distributed/test_dist2 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_dist2_1.1_01d8363cddc2bd49_.log
2025-12-04T13:29:56.6056415Z Running 34 items in this shard: test/distributed/test_dist2.py::ProcessGroupTest::test_context_manager, test/distributed/test_dist2.py::Dist2MultiProcessTestCase::test_allgather, test/distributed/test_dist2.py::Dist2MultiProcessTestCase::test_allreduce, test/distributed/test_dist2.py::Dist2MultiProcessTestCase::test_alltoall_base, test/distributed/test_dist2.py::Dist2MultiProcessTestCase::test_barrier, test/distributed/test_dist2.py::Dist2MultiProcessTestCase::test_broadcast, test/distributed/test_dist2.py::Dist2MultiProcessTestCase::test_gather, test/distributed/test_dist2.py::Dist2MultiProcessTestCase::test_group_split, test/distributed/test_dist2.py::Dist2MultiProcessTestCase::test_reduce, test/distributed/test_dist2.py::Dist2MultiProcessTestCase::test_reduce_scatter, test/distributed/test_dist2.py::Dist2MultiProcessTestCase::test_remote_group_merge, test/distributed/test_dist2.py::Dist2MultiProcessTestCase::test_scatter, test/distributed/test_dist2.py::ProcessGroupGlooTest::test_allgather, test/distributed/test_dist2.py::ProcessGroupGlooTest::test_allreduce, test/distributed/test_dist2.py::ProcessGroupGlooTest::test_alltoall_base, test/distributed/test_dist2.py::ProcessGroupGlooTest::test_barrier, test/distributed/test_dist2.py::ProcessGroupGlooTest::test_broadcast, test/distributed/test_dist2.py::ProcessGroupGlooTest::test_gather, test/distributed/test_dist2.py::ProcessGroupGlooTest::test_group_split, test/distributed/test_dist2.py::ProcessGroupGlooTest::test_reduce, test/distributed/test_dist2.py::ProcessGroupGlooTest::test_reduce_scatter, test/distributed/test_dist2.py::ProcessGroupGlooTest::test_remote_group_merge, test/distributed/test_dist2.py::ProcessGroupGlooTest::test_scatter, test/distributed/test_dist2.py::ProcessGroupNCCLTest::test_allgather, test/distributed/test_dist2.py::ProcessGroupNCCLTest::test_allreduce, test/distributed/test_dist2.py::ProcessGroupNCCLTest::test_alltoall_base, test/distributed/test_dist2.py::ProcessGroupNCCLTest::test_barrier, test/distributed/test_dist2.py::ProcessGroupNCCLTest::test_broadcast, test/distributed/test_dist2.py::ProcessGroupNCCLTest::test_gather, test/distributed/test_dist2.py::ProcessGroupNCCLTest::test_group_split, test/distributed/test_dist2.py::ProcessGroupNCCLTest::test_reduce, test/distributed/test_dist2.py::ProcessGroupNCCLTest::test_reduce_scatter, test/distributed/test_dist2.py::ProcessGroupNCCLTest::test_remote_group_merge, test/distributed/test_dist2.py::ProcessGroupNCCLTest::test_scatter
2025-12-04T13:29:56.6062623Z
2025-12-04T13:29:56.6062823Z Finished distributed/test_dist2 1/1 ... [2025-12-04 13:29:56.604494][2240107.094470035], took 1.66min
2025-12-04T13:29:56.6064126Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:29:56.6080844Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:29:56.6083270Z Running distributed/_composable/fsdp/test_fully_shard_grad_scaler 1/1 ... [2025-12-04 13:29:56.608227][2240107.098207952]
2025-12-04T13:29:56.6083728Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:29:56.6085576Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_grad_scaler.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:29:56.608423]
2025-12-04T13:30:09.9955257Z
2025-12-04T13:30:09.9956369Z distributed/_composable/fsdp/test_fully_shard_grad_scaler 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_grad_scaler_1.1_58bc8ca2ee750627_.log
2025-12-04T13:30:09.9958595Z Running 1 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_grad_scaler.py::TestFullyShardGradientScaler::test_gradient_scaler
2025-12-04T13:30:09.9959144Z
2025-12-04T13:30:09.9959532Z Finished distributed/_composable/fsdp/test_fully_shard_grad_scaler 1/1 ... [2025-12-04 13:30:09.995233][2240120.485209903], took 0.22min
2025-12-04T13:30:09.9970136Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:30:09.9987982Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:30:09.9990227Z Running distributed/launcher/test_run 1/1 ... [2025-12-04 13:30:09.998906][2240120.48888713]
2025-12-04T13:30:09.9990590Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:30:09.9992695Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/launcher/test_run.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:30:09.999103]
2025-12-04T13:30:57.8422915Z
2025-12-04T13:30:57.8423585Z distributed/launcher/test_run 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.launcher.test_run_1.1_9f819f4620a1c222_.log
2025-12-04T13:30:57.8428378Z Running 26 items in this shard: test/distributed/launcher/test_run.py::ElasticLaunchTest::test_capture_logs_using_default_logs_specs, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_init_method_env_with_torchelastic, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_init_method_tcp_with_torchelastic, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_is_not_torchelastic_launched, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_is_torchelastic_launched, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_is_torchelastic_launched_with_logs_spec_defined, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_elastic, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_elastic_agent_raise_exception, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_elastic_multiple_agents, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_elastic_worker_raise_exception, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_run_path, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_shutdown, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_standalone, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_user_script_bash, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_user_script_default_nproc, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_user_script_python, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_user_script_python_caffe2_bc, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_with_env_vars, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_logs_logs_spec_entrypoint_must_be_defined, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_min_max_nodes_parse, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_nproc_gpu_launch_configurations, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_nproc_launch_auto_configurations, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_nproc_launch_number_configurations, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_nproc_launch_unknown_configurations, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_nproc_xpu_launch_configurations, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_virtual_local_rank
2025-12-04T13:30:57.8433398Z
2025-12-04T13:30:57.8433537Z Finished distributed/launcher/test_run 1/1 ... [2025-12-04 13:30:57.842316][2240168.332292485], took 0.80min
2025-12-04T13:30:57.8441466Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:30:57.8458325Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:30:57.8461234Z Running distributed/fsdp/test_fsdp_backward_prefetch 1/1 ... [2025-12-04 13:30:57.845998][2240168.335978453]
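The ElasticLaunchTest cases above exercise torchelastic's launcher. Outside the test suite, the same single-node launch pattern is available through the torchrun CLI; a minimal sketch (the script name is a placeholder):

import subprocess

# Spawn 4 workers on one node; torchrun sets RANK, LOCAL_RANK and
# WORLD_SIZE in each worker's environment, which is what the
# is_torchelastic_launched-style tests assert on.
subprocess.run(
    ["torchrun", "--standalone", "--nnodes=1", "--nproc_per_node=4",
     "my_training_script.py"],
    check=True,
)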
2025-12-04T13:30:57.8461664Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:30:57.8463527Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_backward_prefetch.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:30:57.846202]
2025-12-04T13:31:07.5773511Z
2025-12-04T13:31:07.5774728Z distributed/fsdp/test_fsdp_backward_prefetch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_backward_prefetch_1.1_6ddbd664ed9f5158_.log
2025-12-04T13:31:07.5776145Z Running 1 items in this shard: test/distributed/fsdp/test_fsdp_backward_prefetch.py::TestBackwardPrefetch::test_backward_prefetch
2025-12-04T13:31:07.5776705Z
2025-12-04T13:31:07.5777113Z Finished distributed/fsdp/test_fsdp_backward_prefetch 1/1 ... [2025-12-04 13:31:07.577049][2240178.067023362], took 0.16min
2025-12-04T13:31:07.5791845Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:31:07.5810011Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:31:07.5812591Z Running distributed/fsdp/test_fsdp_pure_fp16 1/1 ... [2025-12-04 13:31:07.581121][2240178.071101685]
2025-12-04T13:31:07.5812950Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:31:07.5814591Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_pure_fp16.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:31:07.581322]
2025-12-04T13:32:16.5419263Z
2025-12-04T13:32:16.5423278Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_pure_fp16 1/1 (test/test-reports/distributed.fsdp.test_fsdp_pure_fp16_1.1_af0f579fa03b5e35_.log)
2025-12-04T13:32:16.5424797Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-7d030560d6c9b3c6.xml
2025-12-04T13:32:16.5425632Z ============================= test session starts ==============================
2025-12-04T13:32:16.5426253Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T13:32:16.5426815Z cachedir: .pytest_cache
2025-12-04T13:32:16.5442780Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T13:32:16.5443251Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T13:32:16.5443484Z configfile: pytest.ini
2025-12-04T13:32:16.5443911Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T13:32:16.5444366Z collecting ... collected 2 items
2025-12-04T13:32:16.5445410Z stepcurrent: Cannot find last run test, not skipping
2025-12-04T13:32:16.5446098Z Running 2 items in this shard: test/distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_fp16_dtypes_cuda, test/distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_pure_fp16_training_cuda
2025-12-04T13:32:16.5446627Z
2025-12-04T13:32:16.5447102Z distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_fp16_dtypes_cuda I1204 13:31:09.379000 231510 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 231579
2025-12-04T13:32:16.5448039Z I1204 13:31:09.379000 231510 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 231580
2025-12-04T13:32:16.5448618Z I1204 13:31:09.380000 231510 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 231581
2025-12-04T13:32:16.5449161Z I1204 13:31:09.381000 231510 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 231582
2025-12-04T13:32:16.5450357Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T13:32:16.5451329Z device_from_device_id = _get_device_from_device_id(
2025-12-04T13:32:16.5452271Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T13:32:16.5453220Z device_from_device_id = _get_device_from_device_id(
2025-12-04T13:32:16.5454152Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T13:32:16.5455078Z device_from_device_id = _get_device_from_device_id(
2025-12-04T13:32:16.5456007Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T13:32:16.5456933Z device_from_device_id = _get_device_from_device_id(
2025-12-04T13:32:16.5457563Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T13:32:16.5458153Z return func(*args, **kwargs)
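The UserWarning repeated on each rank above fires because FSDP received device_id="cuda" without an index. Binding each rank to its GPU before initialization and passing an explicit index removes the ambiguity; a minimal sketch for the usual one-process-per-GPU setup, assuming the rendezvous environment variables are already set by the launcher (model construction elided):

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def setup_fsdp(model: torch.nn.Module, rank: int) -> FSDP:
    # Bind this process to its GPU first so "cuda" resolves to one device.
    torch.cuda.set_device(rank)
    # Passing device_id here also silences the barrier() warning shown
    # above (the device_id argument exists in recent PyTorch releases).
    dist.init_process_group("nccl", device_id=torch.device("cuda", rank))
    # Hand FSDP an explicit index instead of the bare "cuda" string.
    return FSDP(model, device_id=rank)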
2025-12-04T13:32:16.5458153Z return func(*args, **kwargs) 2025-12-04T13:32:16.5458476Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5458909Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5459518Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5460106Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5460800Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5461351Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5461895Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5462509Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5463088Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5463655Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5464231Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5464783Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5465340Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5465916Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5466714Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 3. CUDA driver allocated memory was 2250244096 and is now 3988783104. 
2025-12-04T13:32:16.5467441Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5467879Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5468528Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5468999Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5469367Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5469784Z [rank3]:E1204 13:31:17.679000 231582 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:32:16.5470030Z dist init r=3, world=4 2025-12-04T13:32:16.5470285Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5470626Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5471118Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5471634Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5472114Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5472562Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5473034Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5473501Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5473971Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5474436Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5474904Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5475358Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5475814Z [rank2]:E1204 
13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5476281Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5476904Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 2. CUDA driver allocated memory was 2300575744 and is now 4039114752. 2025-12-04T13:32:16.5477482Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5477834Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5478383Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5478849Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5479214Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5479631Z [rank2]:E1204 13:31:17.691000 231581 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:32:16.5479877Z dist init r=2, world=4 2025-12-04T13:32:16.5480083Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5480484Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5480970Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5481451Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5481975Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5482419Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5482857Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5483317Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5483777Z [rank1]:E1204 13:31:17.697000 231580 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5484242Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5484703Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5485152Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5485604Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5486067Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5486684Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 1. CUDA driver allocated memory was 2317352960 and is now 4055891968. 2025-12-04T13:32:16.5487259Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5487606Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5488152Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5488611Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5488974Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5489385Z [rank1]:E1204 13:31:17.697000 231580 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:32:16.5489657Z dist init r=1, world=4 2025-12-04T13:32:16.5489857Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5490236Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5490721Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5491227Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5491701Z [rank0]:E1204 13:31:17.776000 231579 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5492146Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5492582Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5493043Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5493504Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5493962Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5494426Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5494875Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5495325Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5495786Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5496399Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 0. CUDA driver allocated memory was 2453667840 and is now 4192206848. 
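The RuntimeError above comes from the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 wrapper (the `with policy():` frame in common_utils.py): it snapshots per-device memory counters before the test body and compares them afterwards, reporting both the caching-allocator view and the driver-level view. A minimal sketch of that comparison, built only on public torch.cuda counters; this is an illustration of the idea, not the actual implementation in common_utils.py:

```python
# Hedged sketch (not PyTorch's real leak check): snapshot caching-allocator
# and driver-level usage per device on entry, flag any growth on exit.
# Requires a CUDA/ROCm build with at least one visible device.
import torch

class MemLeakCheck:
    def __enter__(self):
        torch.cuda.synchronize()
        self.before = []
        for dev in range(torch.cuda.device_count()):
            allocator = torch.cuda.memory_allocated(dev)  # caching-allocator bytes
            free, total = torch.cuda.mem_get_info(dev)    # driver-level view
            self.before.append((allocator, total - free))
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            return False  # let the test's own failure propagate untouched
        torch.cuda.synchronize()
        for dev, (alloc0, driver0) in enumerate(self.before):
            alloc1 = torch.cuda.memory_allocated(dev)
            free, total = torch.cuda.mem_get_info(dev)
            driver1 = total - free
            if alloc1 > alloc0 or driver1 > driver0:
                raise RuntimeError(
                    f"possible leak on device {dev}: caching allocator "
                    f"{alloc0} -> {alloc1}, driver {driver0} -> {driver1}"
                )
        return False
```

The log's failure mode matches the second branch of that comparison: both the allocator count and the driver count grew between entry and exit.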
2025-12-04T13:32:16.5496977Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T13:32:16.5497323Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T13:32:16.5497872Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda
2025-12-04T13:32:16.5498331Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T13:32:16.5498692Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:32:16.5499129Z [rank0]:E1204 13:31:17.776000 231579 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T13:32:16.5499368Z dist init r=0, world=4
2025-12-04T13:32:16.5499783Z [rank0]:[W1204 13:31:18.654236084 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T13:32:16.5500237Z FAILED [10.3236s] [ 50%]
2025-12-04T13:32:16.5500303Z
2025-12-04T13:32:16.5500393Z =================================== FAILURES ===================================
2025-12-04T13:32:16.5500577Z ____________________ TestPureFP16CUDA.test_fp16_dtypes_cuda ____________________
2025-12-04T13:32:16.5500744Z Traceback (most recent call last):
2025-12-04T13:32:16.5500993Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T13:32:16.5501238Z     self._join_processes(fn)
2025-12-04T13:32:16.5501484Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T13:32:16.5501748Z     self._check_return_codes(fn, elapsed_time)
2025-12-04T13:32:16.5502013Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T13:32:16.5502270Z     raise RuntimeError(error)
2025-12-04T13:32:16.5502424Z RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T13:32:16.5502585Z Traceback (most recent call last):
2025-12-04T13:32:16.5502825Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:32:16.5503066Z     getattr(self, test_name)()
2025-12-04T13:32:16.5503298Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:32:16.5503530Z     fn()
2025-12-04T13:32:16.5503732Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:32:16.5503963Z     method(*args, **kwargs)
2025-12-04T13:32:16.5504183Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:32:16.5504413Z     method(*args, **kwargs)
2025-12-04T13:32:16.5504789Z   File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5505014Z with policy(): 2025-12-04T13:32:16.5505225Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5505471Z raise RuntimeError(msg) 2025-12-04T13:32:16.5505845Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 2. CUDA driver allocated memory was 2300575744 and is now 4039114752. 2025-12-04T13:32:16.5506186Z 2025-12-04T13:32:16.5506262Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5506561Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5506789Z 2025-12-04T13:32:16.5506881Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5507009Z 2025-12-04T13:32:16.5507010Z 2025-12-04T13:32:16.5507094Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:32:16.5507295Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:32:16.5507700Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-7d030560d6c9b3c6.xml - 2025-12-04T13:32:16.5508041Z =========================== short test summary info ============================ 2025-12-04T13:32:16.5508345Z FAILED [10.3236s] distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_fp16_dtypes_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:32:16.5508629Z Traceback (most recent call last): 2025-12-04T13:32:16.5508873Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5509138Z getattr(self, test_name)() 2025-12-04T13:32:16.5509368Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5509601Z fn() 2025-12-04T13:32:16.5509808Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5510036Z method(*args, **kwargs) 2025-12-04T13:32:16.5510287Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5510520Z method(*args, **kwargs) 2025-12-04T13:32:16.5510741Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5510972Z with policy(): 2025-12-04T13:32:16.5511189Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5511426Z raise RuntimeError(msg) 2025-12-04T13:32:16.5511799Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 2. CUDA driver allocated memory was 2300575744 and is now 4039114752. 
2025-12-04T13:32:16.5512144Z
2025-12-04T13:32:16.5512219Z To execute this test, run the following from the base repo dir:
2025-12-04T13:32:16.5512517Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda
2025-12-04T13:32:16.5512742Z
2025-12-04T13:32:16.5512831Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:32:16.5513017Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T13:32:16.5513176Z ============================== 1 failed in 10.48s ==============================
2025-12-04T13:32:16.5513309Z Got exit code 1
2025-12-04T13:32:16.5513407Z Retrying single test...
2025-12-04T13:32:16.5513674Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-edff2d15c8bd754c.xml
2025-12-04T13:32:16.5513974Z ============================= test session starts ==============================
2025-12-04T13:32:16.5514188Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T13:32:16.5514378Z cachedir: .pytest_cache
2025-12-04T13:32:16.5514603Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T13:32:16.5514840Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T13:32:16.5514961Z configfile: pytest.ini
2025-12-04T13:32:16.5515194Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T13:32:16.5515469Z collecting ... collected 2 items / 1 deselected / 1 selected
2025-12-04T13:32:16.5515761Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_fp16_dtypes_cuda
2025-12-04T13:32:16.5516019Z Running 1 items in this shard
2025-12-04T13:32:16.5516132Z
2025-12-04T13:32:16.5516395Z distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_fp16_dtypes_cuda I1204 13:31:22.378000 231912 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 231981
2025-12-04T13:32:16.5516848Z I1204 13:31:22.379000 231912 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 231982
2025-12-04T13:32:16.5517190Z I1204 13:31:22.379000 231912 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 231983
2025-12-04T13:32:16.5517561Z I1204 13:31:22.380000 231912 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 231984
2025-12-04T13:32:16.5518248Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T13:32:16.5518831Z   device_from_device_id = _get_device_from_device_id(
2025-12-04T13:32:16.5519414Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3.
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:32:16.5520001Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:32:16.5520627Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:32:16.5521210Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:32:16.5521793Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:32:16.5522374Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:32:16.5522763Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:32:16.5523138Z return func(*args, **kwargs) 2025-12-04T13:32:16.5523354Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5523778Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5524278Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5524766Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5525249Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5525741Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5526179Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5526648Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5527141Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5527605Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5528067Z 
[rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5528520Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5528976Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5529448Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5530071Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 2. CUDA driver allocated memory was 2300575744 and is now 4039114752. 2025-12-04T13:32:16.5530693Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5531043Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5531592Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5532056Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5532426Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5532841Z [rank2]:E1204 13:31:30.447000 231983 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:32:16.5533083Z dist init r=2, world=4 2025-12-04T13:32:16.5533292Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5533631Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5534118Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5534601Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5535113Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5535560Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5535999Z [rank3]:E1204 13:31:30.452000 231984 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5536488Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5536950Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5537413Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5537874Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5538322Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5538776Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5539239Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5539859Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 3. CUDA driver allocated memory was 2250244096 and is now 3988783104. 
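Each session prints the same _init_utils.py UserWarning before failing: FSDP received `device_id` as a bare "cuda" with no index. The warning text itself names the fix. A sketch of both variants under a toy world-size-1 setup; the model, rank variable, and process-group bootstrap here are illustrative, not the test's real setup:

```python
# Hedged sketch: two ways to avoid the FSDP `device_id` UserWarning above.
# Assumes one visible GPU; the world-size-1 group and tiny model are
# placeholders for the multi-rank test fixture.
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

rank = 0  # the real test uses each worker's distributed rank
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("nccl", rank=rank, world_size=1)

# Option 1: pin the current device so a bare "cuda" resolves correctly.
torch.cuda.set_device(rank)
model = FSDP(nn.Linear(8, 8).half(), device_id=torch.device("cuda"))

# Option 2 (equivalent): pass the explicit index and skip set_device.
# model = FSDP(nn.Linear(8, 8).half(), device_id=rank)

dist.destroy_process_group()
```

In this log FSDP's guess happens to be right (each rank lands on its own device), so the warning is noise rather than the cause of the failure.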
2025-12-04T13:32:16.5540476Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5540825Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5541373Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5541836Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5542200Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5542616Z [rank3]:E1204 13:31:30.452000 231984 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:32:16.5542864Z dist init r=3, world=4 2025-12-04T13:32:16.5543069Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5543408Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5543894Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5544398Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5544880Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5545336Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5545807Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5546269Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5546734Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5547197Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5547661Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5548113Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5548566Z [rank0]:E1204 
13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5549032Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5549649Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 0. CUDA driver allocated memory was 2453667840 and is now 4192206848. 2025-12-04T13:32:16.5550276Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5550624Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5551170Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5551635Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5552000Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5552414Z [rank0]:E1204 13:31:30.465000 231981 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:32:16.5552655Z dist init r=0, world=4 2025-12-04T13:32:16.5552856Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5553194Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5553715Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5554196Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5554701Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5555148Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5555586Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5556052Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5556512Z [rank1]:E1204 13:31:30.492000 231982 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5556972Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5557438Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5557886Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5558339Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5558805Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5559426Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 1. CUDA driver allocated memory was 2317352960 and is now 4055891968. 2025-12-04T13:32:16.5560003Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5560392Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5560937Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5561401Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5561773Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5562190Z [rank1]:E1204 13:31:30.492000 231982 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:32:16.5562457Z dist init r=1, world=4 2025-12-04T13:32:16.5562857Z [rank0]:[W1204 13:31:30.229621817 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:32:16.5563269Z FAILED [10.1212s] [100%] 2025-12-04T13:32:16.5563334Z 2025-12-04T13:32:16.5563393Z =================================== FAILURES =================================== 2025-12-04T13:32:16.5563574Z ____________________ TestPureFP16CUDA.test_fp16_dtypes_cuda ____________________ 2025-12-04T13:32:16.5563741Z Traceback (most recent call last): 2025-12-04T13:32:16.5564015Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:32:16.5564259Z self._join_processes(fn) 2025-12-04T13:32:16.5564504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:32:16.5564772Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:32:16.5565045Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:32:16.5565311Z raise RuntimeError(error) 2025-12-04T13:32:16.5565468Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:32:16.5565635Z Traceback (most recent call last): 2025-12-04T13:32:16.5565880Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5566124Z getattr(self, test_name)() 2025-12-04T13:32:16.5566362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5566599Z fn() 2025-12-04T13:32:16.5566805Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5567046Z method(*args, **kwargs) 2025-12-04T13:32:16.5567273Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5567507Z method(*args, **kwargs) 2025-12-04T13:32:16.5567731Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5567963Z with policy(): 2025-12-04T13:32:16.5568180Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5568415Z raise RuntimeError(msg) 2025-12-04T13:32:16.5568789Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 2. CUDA driver allocated memory was 2300575744 and is now 4039114752. 2025-12-04T13:32:16.5569127Z 2025-12-04T13:32:16.5569203Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5569499Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5569720Z 2025-12-04T13:32:16.5569811Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5569936Z 2025-12-04T13:32:16.5569938Z 2025-12-04T13:32:16.5570018Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:32:16.5570246Z Process 2 terminated with exit code 10, terminating remaining processes. 
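The ProcessGroupNCCL warning at the top of this block, "destroy_process_group() was not called before program exit, which can leak resources", points at exactly the kind of driver-side allocation the leak check is flagging. The remedy is a one-line teardown in each rank; a minimal pattern, where the init arguments are placeholders and the point is the finally-block cleanup:

```python
# Minimal per-rank teardown addressing the destroy_process_group() warning
# above. The "nccl" backend and rank/world_size wiring are assumptions;
# MASTER_ADDR/MASTER_PORT must be set by the launcher as usual.
import torch.distributed as dist

def run_rank(rank: int, world_size: int) -> None:
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        pass  # ... per-rank test body ...
    finally:
        dist.destroy_process_group()  # release communicator resources at exit
```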
2025-12-04T13:32:16.5570615Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-edff2d15c8bd754c.xml -
2025-12-04T13:32:16.5570957Z =========================== short test summary info ============================
2025-12-04T13:32:16.5571293Z FAILED [10.1212s] distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_fp16_dtypes_cuda - RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T13:32:16.5571578Z Traceback (most recent call last):
2025-12-04T13:32:16.5571824Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:32:16.5572068Z     getattr(self, test_name)()
2025-12-04T13:32:16.5572300Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:32:16.5572534Z     fn()
2025-12-04T13:32:16.5572766Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:32:16.5572996Z     method(*args, **kwargs)
2025-12-04T13:32:16.5573216Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:32:16.5573449Z     method(*args, **kwargs)
2025-12-04T13:32:16.5573667Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:32:16.5573892Z     with policy():
2025-12-04T13:32:16.5574104Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:32:16.5574335Z     raise RuntimeError(msg)
2025-12-04T13:32:16.5574713Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 2. CUDA driver allocated memory was 2300575744 and is now 4039114752.
2025-12-04T13:32:16.5575050Z
2025-12-04T13:32:16.5575126Z To execute this test, run the following from the base repo dir:
2025-12-04T13:32:16.5575424Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda
2025-12-04T13:32:16.5575647Z
2025-12-04T13:32:16.5575737Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:32:16.5575926Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T13:32:16.5576092Z ======================= 1 failed, 1 deselected in 10.28s =======================
2025-12-04T13:32:16.5576233Z Got exit code 1
2025-12-04T13:32:16.5576330Z Retrying single test...
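The "Got exit code 1" / "Retrying single test..." lines come from the test runner re-invoking pytest for just the failing test id, with the same environment, under a fresh report file (hence the rotating xml hashes). A rough Python equivalent of that loop; the pytest invocation and the retry budget are assumptions, while the test id and the two environment variables are taken verbatim from the log:

```python
# Approximation of the runner's single-test retry loop seen above.
import os
import subprocess

TEST_ID = ("test/distributed/fsdp/test_fsdp_pure_fp16.py"
           "::TestPureFP16CUDA::test_fp16_dtypes_cuda")

env = dict(os.environ,
           PYTORCH_TEST_WITH_ROCM="1",
           PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")

for attempt in range(3):  # first run plus two retries (budget assumed)
    code = subprocess.run(
        ["python", "-m", "pytest", "-v", TEST_ID], env=env
    ).returncode
    print(f"Got exit code {code}")
    if code == 0:
        break
    if attempt < 2:
        print("Retrying single test...")
```

A deterministic leak like this one defeats the retry: every attempt below reproduces the identical memory deltas.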
2025-12-04T13:32:16.5576597Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-99ff6aa407c05e82.xml 2025-12-04T13:32:16.5576895Z ============================= test session starts ============================== 2025-12-04T13:32:16.5577106Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:32:16.5577294Z cachedir: .pytest_cache 2025-12-04T13:32:16.5577517Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:32:16.5577756Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:32:16.5577875Z configfile: pytest.ini 2025-12-04T13:32:16.5578101Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:32:16.5578369Z collecting ... collected 2 items / 1 deselected / 1 selected 2025-12-04T13:32:16.5578655Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_fp16_dtypes_cuda 2025-12-04T13:32:16.5578910Z Running 1 items in this shard 2025-12-04T13:32:16.5578984Z 2025-12-04T13:32:16.5579252Z distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_fp16_dtypes_cuda I1204 13:31:35.080000 232314 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 232383 2025-12-04T13:32:16.5579718Z I1204 13:31:35.080000 232314 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 232384 2025-12-04T13:32:16.5580090Z I1204 13:31:35.081000 232314 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 232385 2025-12-04T13:32:16.5580484Z I1204 13:31:35.081000 232314 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 232386 2025-12-04T13:32:16.5581253Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:32:16.5581843Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:32:16.5582423Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:32:16.5583006Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:32:16.5583585Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:32:16.5584159Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:32:16.5584737Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:32:16.5585319Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:32:16.5585705Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:32:16.5586078Z return func(*args, **kwargs) 2025-12-04T13:32:16.5586292Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5586636Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5587124Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5587602Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5588080Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5588529Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5588970Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5589464Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5589932Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5590424Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5591178Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5591628Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5592088Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5592551Z [rank3]:E1204 
13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5593170Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 3. CUDA driver allocated memory was 2250244096 and is now 3988783104. 2025-12-04T13:32:16.5593745Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5594095Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5594643Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5595110Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5595479Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5595891Z [rank3]:E1204 13:31:43.072000 232386 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:32:16.5596133Z dist init r=3, world=4 2025-12-04T13:32:16.5596336Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5596673Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5597156Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5597631Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5598180Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5598624Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5599099Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5599564Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5600051Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5600547Z [rank0]:E1204 13:31:43.079000 232383 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5601012Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5601465Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5601917Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5602381Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5602999Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 0. CUDA driver allocated memory was 2453667840 and is now 4192206848. 2025-12-04T13:32:16.5603578Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5603925Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5604467Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5604928Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5605291Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5605706Z [rank0]:E1204 13:31:43.079000 232383 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:32:16.5605948Z dist init r=0, world=4 2025-12-04T13:32:16.5606150Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5606486Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5606973Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5607451Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5607929Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5608411Z [rank2]:E1204 13:31:43.088000 232385 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5608850Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5609337Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5609801Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5610296Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5610756Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5611206Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5611660Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5612122Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5612739Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 2. CUDA driver allocated memory was 2300575744 and is now 4039114752. 
2025-12-04T13:32:16.5613317Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5613663Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5614209Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5614670Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5615035Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5615447Z [rank2]:E1204 13:31:43.088000 232385 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:32:16.5615686Z dist init r=2, world=4 2025-12-04T13:32:16.5615886Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5616223Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5616709Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5617216Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5617693Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5618141Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5618617Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5619078Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5619543Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5620004Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5620508Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5620959Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5621414Z [rank1]:E1204 
13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5621878Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5622495Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 1. CUDA driver allocated memory was 2317352960 and is now 4055891968. 2025-12-04T13:32:16.5623067Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5623416Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5623960Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5624422Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5624786Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5625197Z [rank1]:E1204 13:31:43.124000 232384 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:32:16.5625436Z dist init r=1, world=4 2025-12-04T13:32:16.5625836Z [rank0]:[W1204 13:31:43.748597405 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:32:16.5626274Z FAILED [9.8211s] [100%] 2025-12-04T13:32:16.5626338Z 2025-12-04T13:32:16.5626396Z =================================== FAILURES =================================== 2025-12-04T13:32:16.5626577Z ____________________ TestPureFP16CUDA.test_fp16_dtypes_cuda ____________________ 2025-12-04T13:32:16.5626745Z Traceback (most recent call last): 2025-12-04T13:32:16.5626989Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:32:16.5627234Z self._join_processes(fn) 2025-12-04T13:32:16.5627504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:32:16.5627769Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:32:16.5628034Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:32:16.5628293Z raise RuntimeError(error) 2025-12-04T13:32:16.5628442Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:32:16.5628604Z Traceback (most recent call last): 2025-12-04T13:32:16.5628843Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5629083Z getattr(self, test_name)() 2025-12-04T13:32:16.5629313Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5629544Z fn() 2025-12-04T13:32:16.5629750Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5629981Z method(*args, **kwargs) 2025-12-04T13:32:16.5630250Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5630480Z method(*args, **kwargs) 2025-12-04T13:32:16.5630700Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5630925Z with policy(): 2025-12-04T13:32:16.5631135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5631367Z raise RuntimeError(msg) 2025-12-04T13:32:16.5631741Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 3. CUDA driver allocated memory was 2250244096 and is now 3988783104. 2025-12-04T13:32:16.5632078Z 2025-12-04T13:32:16.5632155Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5632454Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5632678Z 2025-12-04T13:32:16.5632769Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5632894Z 2025-12-04T13:32:16.5632896Z 2025-12-04T13:32:16.5632975Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:32:16.5633176Z Process 3 terminated with exit code 10, terminating remaining processes. 
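Note: the RuntimeError above is raised by the memory-leak-check policy that wraps each test when PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 is set (the policy() context manager whose __exit__ appears in the traceback, from torch/testing/_internal/common_utils.py). Conceptually it snapshots the caching-allocator and driver-level memory counters before the test body runs and compares them afterwards. A minimal sketch of that comparison using public torch.cuda counters follows; check_leak is an illustrative helper, and the actual leak-check context manager in common_utils.py is more involved (per-device tracking, retries after emptying the cache):

    import torch

    def check_leak(test_fn, device=0):
        # Illustrative helper, not the actual harness code.
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)    # caching-allocator bytes
        free_before, total = torch.cuda.mem_get_info(device)  # driver-level view
        driver_before = total - free_before

        test_fn()  # run the test body

        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after
        if alloc_after > alloc_before:
            raise RuntimeError(
                f"Caching allocator allocated memory was {alloc_before} and is now "
                f"reported as {alloc_after} on device {device}. CUDA driver allocated "
                f"memory was {driver_before} and is now {driver_after}."
            )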
2025-12-04T13:32:16.5633543Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-99ff6aa407c05e82.xml - 2025-12-04T13:32:16.5633883Z =========================== short test summary info ============================ 2025-12-04T13:32:16.5634188Z FAILED [9.8211s] distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_fp16_dtypes_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:32:16.5634473Z Traceback (most recent call last): 2025-12-04T13:32:16.5634719Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5634998Z getattr(self, test_name)() 2025-12-04T13:32:16.5635230Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5635462Z fn() 2025-12-04T13:32:16.5635662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5635893Z method(*args, **kwargs) 2025-12-04T13:32:16.5636116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5636375Z method(*args, **kwargs) 2025-12-04T13:32:16.5636596Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5636820Z with policy(): 2025-12-04T13:32:16.5637031Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5637264Z raise RuntimeError(msg) 2025-12-04T13:32:16.5637633Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_fp16_dtypes_cuda! Caching allocator allocated memory was 512 and is now reported as 6656 on device 3. CUDA driver allocated memory was 2250244096 and is now 3988783104. 2025-12-04T13:32:16.5637971Z 2025-12-04T13:32:16.5638048Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5638347Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_fp16_dtypes_cuda 2025-12-04T13:32:16.5638571Z 2025-12-04T13:32:16.5638658Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5638846Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
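Note: the repro command in the summary above is driven by environment toggles that the test framework reads; a rough sketch of how such gates are read (illustrative; the exact variable handling in common_utils.py may differ):

    import os

    # Toggles visible in this log; the default values here are assumptions.
    TEST_WITH_ROCM = os.getenv("PYTORCH_TEST_WITH_ROCM", "0") == "1"
    MEM_LEAK_CHECK = os.getenv("PYTORCH_TEST_CUDA_MEM_LEAK_CHECK", "0") == "1"
    PRINT_REPRO_ON_FAILURE = os.getenv("PYTORCH_PRINT_REPRO_ON_FAILURE", "1") == "1"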
2025-12-04T13:32:16.5639015Z ======================= 1 failed, 1 deselected in 9.97s ========================
2025-12-04T13:32:16.5639153Z Got exit code 1
2025-12-04T13:32:16.5639347Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_fp16_dtypes_cuda
2025-12-04T13:32:16.5639644Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T13:32:16.5640007Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-356c0c0a36f890c5.xml
2025-12-04T13:32:16.5640348Z ============================= test session starts ==============================
2025-12-04T13:32:16.5640561Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T13:32:16.5640750Z cachedir: .pytest_cache
2025-12-04T13:32:16.5640973Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T13:32:16.5641214Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T13:32:16.5641334Z configfile: pytest.ini
2025-12-04T13:32:16.5641560Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T13:32:16.5641829Z collecting ... collected 2 items / 1 deselected / 1 selected
2025-12-04T13:32:16.5641988Z stepcurrent: skipping 1 already run items.
2025-12-04T13:32:16.5642117Z Running 1 items in this shard
2025-12-04T13:32:16.5642186Z
2025-12-04T13:32:16.5642470Z distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_pure_fp16_training_cuda I1204 13:31:47.497000 232716 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 232785
2025-12-04T13:32:16.5642938Z I1204 13:31:47.498000 232716 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 232786
2025-12-04T13:32:16.5643279Z I1204 13:31:47.498000 232716 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 232787
2025-12-04T13:32:16.5643653Z I1204 13:31:47.499000 232716 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 232788
2025-12-04T13:32:16.5644132Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
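Note: both recurring warnings in this log have straightforward fixes in the test program itself. Passing device_id to init_process_group binds the process group to an explicit device (as the barrier() warning above suggests), and calling destroy_process_group() before exit avoids the NCCL shutdown warning. A minimal sketch, assuming a single-rank group and env-var rendezvous (the MASTER_ADDR/MASTER_PORT values are placeholders):

    import os
    import torch
    import torch.distributed as dist

    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # placeholder rendezvous
    os.environ.setdefault("MASTER_PORT", "29500")
    rank, world_size = 0, 1

    dist.init_process_group(
        "nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),  # mutes the barrier() warning above
    )
    dist.barrier()                # barrier now has an explicit device
    dist.destroy_process_group()  # avoids the shutdown warning seen in this log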
2025-12-04T13:32:16.5644499Z return func(*args, **kwargs) 2025-12-04T13:32:16.5644710Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5645082Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5645569Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5646047Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5646525Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5647074Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5647518Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5647982Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5648448Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5648911Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5649375Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5649825Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5650314Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5650782Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5651407Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 3319791616. 
2025-12-04T13:32:16.5651994Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5652342Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5652938Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5653412Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5653774Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5654213Z [rank3]:E1204 13:31:52.888000 232788 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:32:16.5654456Z dist init r=3, world=4 2025-12-04T13:32:16.5654660Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5654995Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5655481Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5655958Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5656435Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5656881Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5657319Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5657789Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5658250Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5658713Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5659174Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5659624Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5660076Z 
[rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5660583Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5661210Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3386900480. 2025-12-04T13:32:16.5661791Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5662167Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5662719Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5663192Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5663580Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5663993Z [rank1]:E1204 13:31:52.889000 232786 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:32:16.5664235Z dist init r=1, world=4 2025-12-04T13:32:16.5664438Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5664775Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5665264Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5665743Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5666217Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5666665Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5667103Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5667563Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5668023Z [rank2]:E1204 13:31:52.943000 232787 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5668484Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5668945Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5669395Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5669848Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5670355Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5670978Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3370123264. 2025-12-04T13:32:16.5671590Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5671935Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5672515Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5672982Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5673344Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5673758Z [rank2]:E1204 13:31:52.943000 232787 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:32:16.5673997Z dist init r=2, world=4 2025-12-04T13:32:16.5674198Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5674533Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5675018Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5675496Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5675969Z [rank0]:E1204 13:31:52.952000 232785 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5676410Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5676847Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5677312Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5677772Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5678233Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5678691Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5679138Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5679589Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5680049Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5680744Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2453667840 and is now 3523215360. 
2025-12-04T13:32:16.5681330Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5681710Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5682267Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5682740Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5683101Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5683511Z [rank0]:E1204 13:31:52.952000 232785 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:32:16.5683750Z dist init r=0, world=4 2025-12-04T13:32:16.5684150Z [rank0]:[W1204 13:31:53.777553870 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:32:16.5684560Z FAILED [7.0168s] [100%] 2025-12-04T13:32:16.5684623Z 2025-12-04T13:32:16.5684683Z =================================== FAILURES =================================== 2025-12-04T13:32:16.5684865Z ________________ TestPureFP16CUDA.test_pure_fp16_training_cuda _________________ 2025-12-04T13:32:16.5685035Z Traceback (most recent call last): 2025-12-04T13:32:16.5685277Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:32:16.5685519Z self._join_processes(fn) 2025-12-04T13:32:16.5685765Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:32:16.5686028Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:32:16.5686294Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:32:16.5686552Z raise RuntimeError(error) 2025-12-04T13:32:16.5686702Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:32:16.5686865Z Traceback (most recent call last): 2025-12-04T13:32:16.5687105Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5687347Z getattr(self, test_name)() 2025-12-04T13:32:16.5687575Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5687807Z fn() 2025-12-04T13:32:16.5688008Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5688239Z method(*args, **kwargs) 2025-12-04T13:32:16.5688462Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5688694Z method(*args, **kwargs) 2025-12-04T13:32:16.5688913Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5689180Z with policy(): 2025-12-04T13:32:16.5689392Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5689623Z raise RuntimeError(msg) 2025-12-04T13:32:16.5690003Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3386900480. 2025-12-04T13:32:16.5690379Z 2025-12-04T13:32:16.5690455Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5690787Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5691024Z 2025-12-04T13:32:16.5691114Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5691241Z 2025-12-04T13:32:16.5691301Z Process 3 exited with error code 10 and exception: 2025-12-04T13:32:16.5691439Z Traceback (most recent call last): 2025-12-04T13:32:16.5691683Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5691923Z getattr(self, test_name)() 2025-12-04T13:32:16.5692156Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5692384Z fn() 2025-12-04T13:32:16.5692584Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5692812Z method(*args, **kwargs) 2025-12-04T13:32:16.5693032Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5693258Z method(*args, **kwargs) 2025-12-04T13:32:16.5693485Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5693715Z with policy(): 2025-12-04T13:32:16.5693925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5694154Z raise RuntimeError(msg) 2025-12-04T13:32:16.5694533Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 3319791616. 2025-12-04T13:32:16.5694881Z 2025-12-04T13:32:16.5694953Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5695253Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5695487Z 2025-12-04T13:32:16.5695576Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5695704Z 2025-12-04T13:32:16.5695706Z 2025-12-04T13:32:16.5695787Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:32:16.5695991Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:32:16.5696360Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-356c0c0a36f890c5.xml - 2025-12-04T13:32:16.5696701Z =========================== short test summary info ============================ 2025-12-04T13:32:16.5697016Z FAILED [7.0168s] distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_pure_fp16_training_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:32:16.5697311Z Traceback (most recent call last): 2025-12-04T13:32:16.5697553Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5697829Z getattr(self, test_name)() 2025-12-04T13:32:16.5698060Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5698294Z fn() 2025-12-04T13:32:16.5698493Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5704584Z method(*args, **kwargs) 2025-12-04T13:32:16.5704882Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5705118Z method(*args, **kwargs) 2025-12-04T13:32:16.5705339Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5705563Z with policy(): 2025-12-04T13:32:16.5705780Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5706019Z raise RuntimeError(msg) 2025-12-04T13:32:16.5706406Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3386900480. 
2025-12-04T13:32:16.5706757Z 2025-12-04T13:32:16.5706831Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5707140Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5707375Z 2025-12-04T13:32:16.5707462Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5707587Z 2025-12-04T13:32:16.5707645Z Process 3 exited with error code 10 and exception: 2025-12-04T13:32:16.5707783Z Traceback (most recent call last): 2025-12-04T13:32:16.5708022Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5708267Z getattr(self, test_name)() 2025-12-04T13:32:16.5708495Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5708726Z fn() 2025-12-04T13:32:16.5708925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5709154Z method(*args, **kwargs) 2025-12-04T13:32:16.5709375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5709600Z method(*args, **kwargs) 2025-12-04T13:32:16.5709812Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5710039Z with policy(): 2025-12-04T13:32:16.5710304Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5710535Z raise RuntimeError(msg) 2025-12-04T13:32:16.5710916Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 3319791616. 2025-12-04T13:32:16.5711263Z 2025-12-04T13:32:16.5711336Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5711643Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5711877Z 2025-12-04T13:32:16.5711964Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5712191Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:32:16.5712358Z ======================= 1 failed, 1 deselected in 7.17s ======================== 2025-12-04T13:32:16.5712495Z Got exit code 1 2025-12-04T13:32:16.5712592Z Retrying single test... 
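Note: "Retrying single test..." marks the runner's isolation step: after a shard-level failure it reruns just the failing test, and only if the rerun also fails does it print the "FAILED CONSISTENTLY" verdict seen earlier. A sketch of that decision logic as inferred from these log messages (the actual logic lives in PyTorch's test runner and may differ):

    def run_with_retry(run_one_test, max_retries=1):
        # run_one_test is an illustrative callable returning the pytest exit code.
        if run_one_test() == 0:
            return "passed"
        for _ in range(max_retries):
            print("Retrying single test...")
            if run_one_test() == 0:
                return "flaky"
        print("FAILED CONSISTENTLY")
        return "failed"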
2025-12-04T13:32:16.5712856Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-495a5aadcffd3655.xml
2025-12-04T13:32:16.5713148Z ============================= test session starts ==============================
2025-12-04T13:32:16.5713388Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T13:32:16.5713576Z cachedir: .pytest_cache
2025-12-04T13:32:16.5713795Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T13:32:16.5714037Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T13:32:16.5714158Z configfile: pytest.ini
2025-12-04T13:32:16.5714381Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T13:32:16.5714645Z collecting ... collected 2 items / 1 deselected / 1 selected
2025-12-04T13:32:16.5714943Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_pure_fp16_training_cuda
2025-12-04T13:32:16.5715207Z Running 1 items in this shard
2025-12-04T13:32:16.5715277Z
2025-12-04T13:32:16.5715559Z distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_pure_fp16_training_cuda I1204 13:31:57.048000 233118 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 233187
2025-12-04T13:32:16.5716023Z I1204 13:31:57.049000 233118 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 233188
2025-12-04T13:32:16.5716368Z I1204 13:31:57.050000 233118 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 233189
2025-12-04T13:32:16.5716706Z I1204 13:31:57.050000 233118 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 233190
2025-12-04T13:32:16.5717184Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T13:32:16.5717548Z return func(*args, **kwargs) 2025-12-04T13:32:16.5717765Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5718101Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5718588Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5719065Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5719541Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5719990Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5720469Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5720930Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5721427Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5721885Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5722382Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5722828Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5723278Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5723743Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5724370Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3370123264. 
2025-12-04T13:32:16.5724959Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5725307Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5725864Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5726337Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5726701Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5727116Z [rank2]:E1204 13:32:02.482000 233189 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:32:16.5727355Z dist init r=2, world=4 2025-12-04T13:32:16.5727555Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5727890Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5728375Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5728848Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5729323Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5729767Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5730263Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5730757Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5731215Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5731715Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5732174Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5732621Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5733069Z 
[rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5733528Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5734153Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 3319791616. 2025-12-04T13:32:16.5734735Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5735082Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5735637Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5736104Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5736465Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5736876Z [rank3]:E1204 13:32:02.489000 233190 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:32:16.5737117Z dist init r=3, world=4 2025-12-04T13:32:16.5737315Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5737648Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5738127Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5738603Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5739077Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5739552Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5739987Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5740501Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5740992Z [rank1]:E1204 13:32:02.500000 233188 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5741450Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5741909Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5742354Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5742800Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5743262Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5743885Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3386900480. 2025-12-04T13:32:16.5744469Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5744815Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5745368Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5745835Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5746195Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5746606Z [rank1]:E1204 13:32:02.500000 233188 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:32:16.5746844Z dist init r=1, world=4 2025-12-04T13:32:16.5747041Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5747373Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5747852Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5748326Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5748831Z [rank0]:E1204 13:32:02.556000 233187 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5749272Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5749737Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5750233Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5750692Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5751154Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5751618Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5752065Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5752516Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5752975Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5753596Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2453667840 and is now 3523215360. 
2025-12-04T13:32:16.5754181Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5754531Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5755088Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5755558Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5755919Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5756327Z [rank0]:E1204 13:32:02.556000 233187 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:32:16.5756572Z dist init r=0, world=4 2025-12-04T13:32:16.5756968Z [rank0]:[W1204 13:32:02.291615182 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:32:16.5757374Z FAILED [7.1169s] [100%] 2025-12-04T13:32:16.5757479Z 2025-12-04T13:32:16.5757534Z =================================== FAILURES =================================== 2025-12-04T13:32:16.5757714Z ________________ TestPureFP16CUDA.test_pure_fp16_training_cuda _________________ 2025-12-04T13:32:16.5757878Z Traceback (most recent call last): 2025-12-04T13:32:16.5758117Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:32:16.5758356Z self._join_processes(fn) 2025-12-04T13:32:16.5758595Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:32:16.5758883Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:32:16.5759148Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:32:16.5759402Z raise RuntimeError(error) 2025-12-04T13:32:16.5759548Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:32:16.5759705Z Traceback (most recent call last): 2025-12-04T13:32:16.5759938Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5760201Z getattr(self, test_name)() 2025-12-04T13:32:16.5760426Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5760652Z fn() 2025-12-04T13:32:16.5760848Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5761077Z method(*args, **kwargs) 2025-12-04T13:32:16.5761293Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5761517Z method(*args, **kwargs) 2025-12-04T13:32:16.5761730Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5761955Z with policy(): 2025-12-04T13:32:16.5762161Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5762388Z raise RuntimeError(msg) 2025-12-04T13:32:16.5762763Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3370123264. 2025-12-04T13:32:16.5763106Z 2025-12-04T13:32:16.5763182Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5763484Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5763718Z 2025-12-04T13:32:16.5763804Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5763928Z 2025-12-04T13:32:16.5763984Z Process 3 exited with error code 10 and exception: 2025-12-04T13:32:16.5764119Z Traceback (most recent call last): 2025-12-04T13:32:16.5764355Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5764594Z getattr(self, test_name)() 2025-12-04T13:32:16.5764819Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5765047Z fn() 2025-12-04T13:32:16.5765243Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5765465Z method(*args, **kwargs) 2025-12-04T13:32:16.5765678Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5765935Z method(*args, **kwargs) 2025-12-04T13:32:16.5766149Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5766372Z with policy(): 2025-12-04T13:32:16.5766577Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5766805Z raise RuntimeError(msg) 2025-12-04T13:32:16.5767212Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 3319791616. 2025-12-04T13:32:16.5767559Z 2025-12-04T13:32:16.5767631Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5767928Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5768162Z 2025-12-04T13:32:16.5768246Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5768369Z 2025-12-04T13:32:16.5768370Z 2025-12-04T13:32:16.5768448Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:32:16.5768649Z Process 2 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:32:16.5769018Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-495a5aadcffd3655.xml - 2025-12-04T13:32:16.5769362Z =========================== short test summary info ============================ 2025-12-04T13:32:16.5769668Z FAILED [7.1169s] distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_pure_fp16_training_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:32:16.5769957Z Traceback (most recent call last): 2025-12-04T13:32:16.5770232Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5770473Z getattr(self, test_name)() 2025-12-04T13:32:16.5770700Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5770930Z fn() 2025-12-04T13:32:16.5771127Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5771350Z method(*args, **kwargs) 2025-12-04T13:32:16.5771566Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5771792Z method(*args, **kwargs) 2025-12-04T13:32:16.5772005Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5772229Z with policy(): 2025-12-04T13:32:16.5772434Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5772663Z raise RuntimeError(msg) 2025-12-04T13:32:16.5773036Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3370123264. 
2025-12-04T13:32:16.5773381Z 2025-12-04T13:32:16.5773452Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5773754Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5773986Z 2025-12-04T13:32:16.5774071Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5774192Z 2025-12-04T13:32:16.5774282Z Process 3 exited with error code 10 and exception: 2025-12-04T13:32:16.5774416Z Traceback (most recent call last): 2025-12-04T13:32:16.5774649Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5774888Z getattr(self, test_name)() 2025-12-04T13:32:16.5775111Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5775336Z fn() 2025-12-04T13:32:16.5775531Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5775783Z method(*args, **kwargs) 2025-12-04T13:32:16.5775996Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5776219Z method(*args, **kwargs) 2025-12-04T13:32:16.5776431Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5776652Z with policy(): 2025-12-04T13:32:16.5776854Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5777080Z raise RuntimeError(msg) 2025-12-04T13:32:16.5777452Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 3319791616. 2025-12-04T13:32:16.5777795Z 2025-12-04T13:32:16.5777869Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5778166Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5778394Z 2025-12-04T13:32:16.5778479Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5778663Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:32:16.5778824Z ======================= 1 failed, 1 deselected in 7.27s ======================== 2025-12-04T13:32:16.5778959Z Got exit code 1 2025-12-04T13:32:16.5779051Z Retrying single test... 
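Exit code 10 marks the leak-check failure path rather than an assertion in the test body: with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1, each test runs under a policy context manager (the "with policy():" frame in the tracebacks) that snapshots per-device memory on entry and re-checks it on exit, raising the "CUDA driver API confirmed a leak" error when the caching-allocator figure grows (here 512 -> 4608 bytes) and the driver-level figure corroborates it. The real implementation lives in torch.testing._internal.common_utils; the following is only a simplified sketch of that before/after check, with an invented class name:

import gc
import torch

class CudaMemLeakCheck:
    # Simplified stand-in for the leak-check policy: snapshot allocator and
    # driver memory per device on entry, re-check on exit, raise on growth.
    def __enter__(self):
        gc.collect()
        torch.cuda.synchronize()
        ndev = torch.cuda.device_count()
        self.allocator_before = [torch.cuda.memory_allocated(d) for d in range(ndev)]
        # mem_get_info() reports (free, total) bytes as seen by the driver.
        self.driver_free_before = [torch.cuda.mem_get_info(d)[0] for d in range(ndev)]
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            return False  # never mask the test's own exception
        gc.collect()
        torch.cuda.synchronize()
        for d in range(torch.cuda.device_count()):
            allocator_after = torch.cuda.memory_allocated(d)
            driver_free_after = torch.cuda.mem_get_info(d)[0]
            if (allocator_after > self.allocator_before[d]
                    and driver_free_after < self.driver_free_before[d]):
                raise RuntimeError(
                    f"CUDA memory leak on device {d}: caching allocator "
                    f"{self.allocator_before[d]} -> {allocator_after} bytes"
                )
        return False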
2025-12-04T13:32:16.5779310Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-795064c8053ae33b.xml 2025-12-04T13:32:16.5779598Z ============================= test session starts ============================== 2025-12-04T13:32:16.5779807Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:32:16.5779990Z cachedir: .pytest_cache 2025-12-04T13:32:16.5780246Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:32:16.5780481Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:32:16.5780594Z configfile: pytest.ini 2025-12-04T13:32:16.5780815Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:32:16.5781079Z collecting ... collected 2 items / 1 deselected / 1 selected 2025-12-04T13:32:16.5781369Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_pure_fp16_training_cuda 2025-12-04T13:32:16.5781632Z Running 1 items in this shard 2025-12-04T13:32:16.5781702Z 2025-12-04T13:32:16.5781981Z distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_pure_fp16_training_cuda I1204 13:32:06.568000 233520 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 233589 2025-12-04T13:32:16.5782439Z I1204 13:32:06.569000 233520 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 233590 2025-12-04T13:32:16.5782807Z I1204 13:32:06.570000 233520 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 233591 2025-12-04T13:32:16.5783143Z I1204 13:32:06.570000 233520 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 233592 2025-12-04T13:32:16.5783622Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:32:16.5783990Z return func(*args, **kwargs) 2025-12-04T13:32:16.5784230Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5784569Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5785056Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5785532Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5786006Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5786454Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5786889Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5787349Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5787810Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5788267Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5788727Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5789171Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5789619Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5790077Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5790739Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 3319791616. 
2025-12-04T13:32:16.5791322Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5791666Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5792255Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5792723Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5793082Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5793521Z [rank3]:E1204 13:32:12.053000 233592 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:32:16.5793759Z dist init r=3, world=4 2025-12-04T13:32:16.5793958Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5794292Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5794771Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5795244Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5795717Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5796159Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5796600Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5797056Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5797512Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5797969Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5798429Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5798876Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5799323Z 
[rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5799783Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5800438Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3370123264. 2025-12-04T13:32:16.5801049Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5801393Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5801943Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5802433Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5802791Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5803198Z [rank2]:E1204 13:32:12.055000 233591 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:32:16.5803435Z dist init r=2, world=4 2025-12-04T13:32:16.5803633Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5803963Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5804442Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5804917Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5805391Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5805834Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5806267Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5806722Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5807179Z [rank1]:E1204 13:32:12.093000 233590 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5807635Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5808097Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5808550Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5808998Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5809457Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5810076Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3386900480. 2025-12-04T13:32:16.5810714Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5811058Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5811636Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5812103Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5812461Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5812870Z [rank1]:E1204 13:32:12.093000 233590 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:32:16.5813105Z dist init r=1, world=4 2025-12-04T13:32:16.5813303Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:32:16.5813633Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:32:16.5814113Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5814587Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:32:16.5815060Z [rank0]:E1204 13:32:12.113000 233589 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5815508Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:32:16.5815943Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5816402Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5816860Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5817316Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:32:16.5817778Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5818227Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:32:16.5818675Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5819160Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:32:16.5819781Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2453667840 and is now 3523215360. 
2025-12-04T13:32:16.5820409Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5820784Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5821336Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5821809Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:32:16.5822166Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5822576Z [rank0]:E1204 13:32:12.113000 233589 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:32:16.5822811Z dist init r=0, world=4 2025-12-04T13:32:16.5823209Z [rank0]:[W1204 13:32:12.893685156 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:32:16.5823615Z FAILED [7.1168s] [100%] 2025-12-04T13:32:16.5823677Z 2025-12-04T13:32:16.5823733Z =================================== FAILURES =================================== 2025-12-04T13:32:16.5823913Z ________________ TestPureFP16CUDA.test_pure_fp16_training_cuda _________________ 2025-12-04T13:32:16.5824076Z Traceback (most recent call last): 2025-12-04T13:32:16.5824313Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:32:16.5824555Z self._join_processes(fn) 2025-12-04T13:32:16.5824795Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:32:16.5825056Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:32:16.5825319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:32:16.5825574Z raise RuntimeError(error) 2025-12-04T13:32:16.5825721Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:32:16.5825878Z Traceback (most recent call last): 2025-12-04T13:32:16.5826111Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5826349Z getattr(self, test_name)() 2025-12-04T13:32:16.5826575Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5826802Z fn() 2025-12-04T13:32:16.5826999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5827227Z method(*args, **kwargs) 2025-12-04T13:32:16.5827445Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5827671Z method(*args, **kwargs) 2025-12-04T13:32:16.5827886Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5827956Z with policy(): 2025-12-04T13:32:16.5828108Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5828149Z raise RuntimeError(msg) 2025-12-04T13:32:16.5828459Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 3319791616. 2025-12-04T13:32:16.5828462Z 2025-12-04T13:32:16.5828556Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5828753Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5828756Z 2025-12-04T13:32:16.5828844Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5828846Z 2025-12-04T13:32:16.5828848Z 2025-12-04T13:32:16.5828925Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:32:16.5829010Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:32:16.5829257Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-795064c8053ae33b.xml - 2025-12-04T13:32:16.5829317Z =========================== short test summary info ============================ 2025-12-04T13:32:16.5829532Z FAILED [7.1168s] distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_pure_fp16_training_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:32:16.5829578Z Traceback (most recent call last): 2025-12-04T13:32:16.5829741Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:32:16.5829787Z getattr(self, test_name)() 2025-12-04T13:32:16.5829945Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:32:16.5829981Z fn() 2025-12-04T13:32:16.5830131Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5830199Z method(*args, **kwargs) 2025-12-04T13:32:16.5830348Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:32:16.5830391Z method(*args, **kwargs) 2025-12-04T13:32:16.5830539Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:32:16.5830576Z with policy(): 2025-12-04T13:32:16.5830726Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:32:16.5830768Z raise RuntimeError(msg) 2025-12-04T13:32:16.5831078Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestPureFP16CUDA.test_pure_fp16_training_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 3319791616. 
2025-12-04T13:32:16.5831080Z 2025-12-04T13:32:16.5831154Z To execute this test, run the following from the base repo dir: 2025-12-04T13:32:16.5831348Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_pure_fp16.py TestPureFP16CUDA.test_pure_fp16_training_cuda 2025-12-04T13:32:16.5831351Z 2025-12-04T13:32:16.5831437Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:32:16.5831498Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:32:16.5831594Z ======================= 1 failed, 1 deselected in 7.28s ======================== 2025-12-04T13:32:16.5831630Z Got exit code 1 2025-12-04T13:32:16.5831778Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_pure_fp16_training_cuda 2025-12-04T13:32:16.5831904Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:32:16.5832105Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-e9b17c892262ed70.xml 2025-12-04T13:32:16.5832163Z ============================= test session starts ============================== 2025-12-04T13:32:16.5832297Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:32:16.5832339Z cachedir: .pytest_cache 2025-12-04T13:32:16.5832497Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:32:16.5832545Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:32:16.5832585Z configfile: pytest.ini 2025-12-04T13:32:16.5832746Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:32:16.5832817Z collecting ... collected 2 items / 2 deselected / 0 selected 2025-12-04T13:32:16.5832870Z stepcurrent: skipping 2 already run items. 2025-12-04T13:32:16.5832912Z Running 0 items in this shard 2025-12-04T13:32:16.5832914Z 2025-12-04T13:32:16.5833159Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_pure_fp16/distributed.fsdp.test_fsdp_pure_fp16-e9b17c892262ed70.xml - 2025-12-04T13:32:16.5833217Z ============================ 2 deselected in 0.00s ============================= 2025-12-04T13:32:16.5833510Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_fp16_dtypes_cuda', 'test/distributed/fsdp/test_fsdp_pure_fp16.py::TestPureFP16CUDA::test_pure_fp16_training_cuda'] 2025-12-04T13:32:16.5833514Z 2025-12-04T13:32:16.5833706Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_pure_fp16 1/1 (test/test-reports/distributed.fsdp.test_fsdp_pure_fp16_1.1_af0f579fa03b5e35_.log) 2025-12-04T13:32:16.5833709Z 2025-12-04T13:32:16.5833837Z Finished distributed/fsdp/test_fsdp_pure_fp16 1/1 ... 
[2025-12-04 13:32:16.542132][2240247.032110015], took 1.15min 2025-12-04T13:32:16.5834099Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:32:16.5834186Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:32:16.5834280Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T13:32:16.5834327Z Uploading artifacts took 0.00 seconds 2025-12-04T13:32:16.5834385Z distributed/fsdp/test_fsdp_pure_fp16 1/1 failed! 2025-12-04T13:32:16.5834506Z Running distributed/checkpoint/test_checkpoint 1/1 ... [2025-12-04 13:32:16.545449][2240247.035429108] 2025-12-04T13:32:16.5834554Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:32:16.5834881Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:32:16.545632] 2025-12-04T13:32:50.8661736Z 2025-12-04T13:32:50.8664947Z distributed/checkpoint/test_checkpoint 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_checkpoint_1.1_d7eadd9c253b27ac_.log 2025-12-04T13:32:50.8666520Z Running 8 items in this shard: test/distributed/checkpoint/test_checkpoint.py::TestDistributedCheckpointing::test_default_metadata, test/distributed/checkpoint/test_checkpoint.py::TestDistributedCheckpointing::test_tensor_metadata_with_missing_rank_spec, test/distributed/checkpoint/test_checkpoint.py::TestDistributedFailure::test_dummy_reader_works, test/distributed/checkpoint/test_checkpoint.py::TestDistributedFailure::test_dummy_writer_works, test/distributed/checkpoint/test_checkpoint.py::TestDistributedFailure::test_load_error_handling, test/distributed/checkpoint/test_checkpoint.py::TestDistributedFailure::test_load_error_handling_no_dist, test/distributed/checkpoint/test_checkpoint.py::TestDistributedFailure::test_save_error_handling, test/distributed/checkpoint/test_checkpoint.py::TestDistributedFailure::test_save_error_handling_no_dist 2025-12-04T13:32:50.8668298Z 2025-12-04T13:32:50.8668563Z Finished distributed/checkpoint/test_checkpoint 1/1 ... [2025-12-04 13:32:50.865890][2240281.355866245], took 0.57min 2025-12-04T13:32:50.8678736Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:32:50.8695049Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:32:50.8697170Z Running distributed/fsdp/test_fsdp_apply 1/1 ... [2025-12-04 13:32:50.869611][2240281.359591962] 2025-12-04T13:32:50.8697385Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:32:50.8699377Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_apply.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:32:50.869823] 2025-12-04T13:34:05.5405491Z 2025-12-04T13:34:05.5406816Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_apply 1/1 (test/test-reports/distributed.fsdp.test_fsdp_apply_1.1_f5676752440bc7db_.log) 2025-12-04T13:34:05.5408025Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-b4be4d1d7bd01e29.xml 2025-12-04T13:34:05.5408964Z ============================= test session starts ============================== 2025-12-04T13:34:05.5409585Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:34:05.5410139Z cachedir: .pytest_cache 2025-12-04T13:34:05.5410883Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:34:05.5411570Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:34:05.5411919Z configfile: pytest.ini 2025-12-04T13:34:05.5412558Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:34:05.5413555Z collecting ... collected 3 items 2025-12-04T13:34:05.5413935Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T13:34:05.5415332Z Running 3 items in this shard: test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_apply_in_summon_raises_error_cuda, test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_nested_module_apply_cuda, test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_transformer_module_apply_cuda 2025-12-04T13:34:05.5416181Z 2025-12-04T13:34:05.5416623Z distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_apply_in_summon_raises_error_cuda I1204 13:32:52.666000 236235 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 236304 2025-12-04T13:34:05.5417339Z I1204 13:32:52.667000 236235 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 236305 2025-12-04T13:34:05.5418195Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:34:05.5418875Z self.encoder = TransformerEncoder( 2025-12-04T13:34:05.5419762Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:34:05.5421405Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5422240Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:34:05.5422922Z self.encoder = TransformerEncoder( 2025-12-04T13:34:05.5423802Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. 
FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:34:05.5424689Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5424912Z File "<string>", line 1, in <module> 2025-12-04T13:34:05.5425214Z File "/opt/conda/envs/py_3.10/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2025-12-04T13:34:05.5425524Z exitcode = _main(fd, parent_sentinel) 2025-12-04T13:34:05.5425811Z File "/opt/conda/envs/py_3.10/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2025-12-04T13:34:05.5426109Z return self._bootstrap(parent_sentinel) 2025-12-04T13:34:05.5426348Z File "/opt/conda/envs/py_3.10/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap 2025-12-04T13:34:05.5426579Z self.run() 2025-12-04T13:34:05.5426772Z File "/opt/conda/envs/py_3.10/lib/python3.10/multiprocessing/process.py", line 108, in run 2025-12-04T13:34:05.5427003Z self._target(*self._args, **self._kwargs) 2025-12-04T13:34:05.5427283Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1272, in _run 2025-12-04T13:34:05.5427556Z self.run_test(test_name, pipe) 2025-12-04T13:34:05.5427845Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5428129Z getattr(self, test_name)() 2025-12-04T13:34:05.5428415Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5428692Z fn() 2025-12-04T13:34:05.5428933Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5429204Z method(*args, **kwargs) 2025-12-04T13:34:05.5429467Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5429746Z method(*args, **kwargs) 2025-12-04T13:34:05.5430004Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5430318Z method(*args, **kwargs) 2025-12-04T13:34:05.5430611Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 428, in instantiated_test 2025-12-04T13:34:05.5430921Z result = test(self, **param_kwargs) 2025-12-04T13:34:05.5431204Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 227, in wrapper 2025-12-04T13:34:05.5431493Z return func(*args, **kwargs) 2025-12-04T13:34:05.5431771Z File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_apply.py", line 113, in test_apply_in_summon_raises_error 2025-12-04T13:34:05.5432075Z transformer.apply(self._init_linear_weights) 2025-12-04T13:34:05.5432456Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 586, in apply 2025-12-04T13:34:05.5432763Z self._assert_state(TrainingState.IDLE) 2025-12-04T13:34:05.5433078Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 1028, in _assert_state 2025-12-04T13:34:05.5433388Z traceback.print_stack() 2025-12-04T13:34:05.5433643Z [rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]
Caught exception:
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2099249152.
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
[rank1]:E1204 13:32:56.469000 236305 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2252341248.
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
[rank0]:E1204 13:32:56.470000 236304 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
dist init r=1, world=2
dist init r=0, world=2
Asserting FSDP instance is: FullyShardedDataParallel(
  (_fsdp_wrapped_module): TransformerWithSharedParams(
    (embed_tokens): Embedding(23, 16)
    (transformer): Transformer(
      (encoder): TransformerEncoder(
        (layers): ModuleList(
          (0-1): 2 x FullyShardedDataParallel(
            (_fsdp_wrapped_module): TransformerEncoderLayer(
              (self_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True)
              )
              (linear1): Linear(in_features=16, out_features=8, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
              (linear2): Linear(in_features=8, out_features=16, bias=True)
              (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
              (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
              (dropout1): Dropout(p=0.1, inplace=False)
              (dropout2): Dropout(p=0.1, inplace=False)
            )
          )
        )
        (norm): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
      )
      (decoder): TransformerDecoder(
        (layers): ModuleList(
          (0-1): 2 x FullyShardedDataParallel(
            (_fsdp_wrapped_module): TransformerDecoderLayer(
              (self_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True)
              )
              (multihead_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True)
              )
              (linear1): Linear(in_features=16, out_features=8, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
              (linear2): Linear(in_features=8, out_features=16, bias=True)
              (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
              (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
              (norm3): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
              (dropout1): Dropout(p=0.1, inplace=False)
              (dropout2): Dropout(p=0.1, inplace=False)
              (dropout3): Dropout(p=0.1, inplace=False)
            )
          )
        )
        (norm): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
      )
    )
    (output_proj): Linear(in_features=16, out_features=23, bias=True)
    (bn): BatchNorm1d(2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
)
ERROR: expected to be in states [] but current state is TrainingState.SUMMON_FULL_PARAMS
[rank0]:[W1204 13:32:56.155309103 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
FAILED [5.2129s] [ 33%]

=================================== FAILURES ===================================
_____________ TestApplyCUDA.test_apply_in_summon_raises_error_cuda _____________
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
    self._join_processes(fn)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
    self._check_return_codes(fn, elapsed_time)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
    raise RuntimeError(error)
RuntimeError: Process 0 exited with error code 10 and exception:
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
    getattr(self, test_name)()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
    fn()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
    with policy():
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
    raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2252341248.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

Process 1 exited with error code 10 and exception:
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
    getattr(self, test_name)()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
    fn()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
    with policy():
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
    raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2099249152.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0


----------------------------- Captured stdout call -----------------------------
Process 0 terminated with exit code 10, terminating remaining processes.
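The "ERROR: expected to be in states [] but current state is TrainingState.SUMMON_FULL_PARAMS" line above comes from FSDP's _assert_state check inside FullyShardedDataParallel.apply, which refuses to run while full parameters are summoned. A minimal hedged sketch of the pattern this test exercises; `model` and `init_linear_weights` are placeholders for this illustration (the test's own helper is self._init_linear_weights), and process-group setup is elided:

# Sketch only: nn.Module.apply on an FSDP wrapper while summon_full_params
# is active trips the TrainingState assertion seen in the log above.
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def init_linear_weights(module: nn.Module) -> None:
    # placeholder initializer applied to every submodule
    if isinstance(module, nn.Linear):
        nn.init.ones_(module.weight)

fsdp_model = FSDP(model)  # `model` is a placeholder module
with FSDP.summon_full_params(fsdp_model):
    # expected to raise: FSDP state is SUMMON_FULL_PARAMS, not IDLE
    fsdp_model.apply(init_linear_weights)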
- generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-b4be4d1d7bd01e29.xml -
=========================== short test summary info ============================
FAILED [5.2129s] distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_apply_in_summon_raises_error_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
    getattr(self, test_name)()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
    fn()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
    with policy():
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
    raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2252341248.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

Process 1 exited with error code 10 and exception:
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
    getattr(self, test_name)()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
    fn()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
    with policy():
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
    raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2099249152.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
============================== 1 failed in 5.37s ===============================
Got exit code 1
Retrying single test...
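The repro line above enables PyTorch's leak checker via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1; the RuntimeError it raises compares caching-allocator and driver-level counters taken before and after the test body. A minimal sketch of that kind of before/after accounting, illustrative only and not the torch.testing internals this job runs (driver_allocated and check_for_leak are made-up names):

# Sketch of a CUDA/ROCm memory-leak check: snapshot allocator and driver
# memory around a test body and flag growth in both counters.
import torch

def driver_allocated(device: int) -> int:
    # mem_get_info returns (free, total) bytes as seen by the driver
    free, total = torch.cuda.mem_get_info(device)
    return total - free

def check_for_leak(test_fn, device: int = 0) -> None:
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    driver_before = driver_allocated(device)

    test_fn()

    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()  # release cached blocks so driver numbers settle
    alloc_after = torch.cuda.memory_allocated(device)
    driver_after = driver_allocated(device)

    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak: caching allocator {alloc_before} -> {alloc_after}, "
            f"driver {driver_before} -> {driver_after} on device {device}"
        )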
Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-3543875e50638934.xml
============================= test session starts ==============================
platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
cachedir: .pytest_cache
hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
rootdir: /var/lib/jenkins/pytorch
configfile: pytest.ini
plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
collecting ... collected 3 items / 2 deselected / 1 selected
stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_apply_in_summon_raises_error_cuda
Running 1 items in this shard

distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_apply_in_summon_raises_error_cuda I1204 13:33:00.233000 236463 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 236532
I1204 13:33:00.234000 236463 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 236533
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
  self.encoder = TransformerEncoder(
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
  self.encoder = TransformerEncoder(
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
  device_from_device_id = _get_device_from_device_id(
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
  device_from_device_id = _get_device_from_device_id(
  File "<string>", line 1, in <module>
  File "/opt/conda/envs/py_3.10/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/opt/conda/envs/py_3.10/lib/python3.10/multiprocessing/spawn.py", line 129, in _main
    return self._bootstrap(parent_sentinel)
  File "/opt/conda/envs/py_3.10/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/conda/envs/py_3.10/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1272, in _run
    self.run_test(test_name, pipe)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
    getattr(self, test_name)()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
    fn()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 428, in instantiated_test
    result = test(self, **param_kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 227, in wrapper
    return func(*args, **kwargs)
  File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_apply.py", line 113, in test_apply_in_summon_raises_error
    transformer.apply(self._init_linear_weights)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 586, in apply
    self._assert_state(TrainingState.IDLE)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 1028, in _assert_state
    traceback.print_stack()
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2252341248.
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
[rank0]:E1204 13:33:04.073000 236532 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
dist init r=0, world=2
Asserting FSDP instance is: FullyShardedDataParallel(
  (_fsdp_wrapped_module): TransformerWithSharedParams(
    (embed_tokens): Embedding(23, 16)
    (transformer): Transformer(
      (encoder): TransformerEncoder(
        (layers): ModuleList(
          (0-1): 2 x FullyShardedDataParallel(
            (_fsdp_wrapped_module): TransformerEncoderLayer(
              (self_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True)
              )
              (linear1): Linear(in_features=16, out_features=8, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
              (linear2): Linear(in_features=8, out_features=16, bias=True)
              (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
              (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
              (dropout1): Dropout(p=0.1, inplace=False)
              (dropout2): Dropout(p=0.1, inplace=False)
            )
          )
        )
        (norm): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
      )
      (decoder): TransformerDecoder(
        (layers): ModuleList(
          (0-1): 2 x FullyShardedDataParallel(
            (_fsdp_wrapped_module): TransformerDecoderLayer(
              (self_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True)
              )
              (multihead_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True)
              )
              (linear1): Linear(in_features=16, out_features=8, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
              (linear2): Linear(in_features=8, out_features=16, bias=True)
              (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
              (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
              (norm3): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
              (dropout1): Dropout(p=0.1, inplace=False)
              (dropout2): Dropout(p=0.1, inplace=False)
              (dropout3): Dropout(p=0.1, inplace=False)
            )
          )
        )
        (norm): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
      )
    )
    (output_proj): Linear(in_features=16, out_features=23, bias=True)
    (bn): BatchNorm1d(2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
)
ERROR: expected to be in states [] but current state is TrainingState.SUMMON_FULL_PARAMS
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2099249152.
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
[rank1]:E1204 13:33:04.109000 236533 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
dist init r=1, world=2
[rank0]:[W1204 13:33:04.758721097 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
FAILED [5.3122s] [100%]

=================================== FAILURES ===================================
_____________ TestApplyCUDA.test_apply_in_summon_raises_error_cuda _____________
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
    self._join_processes(fn)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
    self._check_return_codes(fn, elapsed_time)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
    raise RuntimeError(error)
RuntimeError: Process 0 exited with error code 10 and exception:
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
    getattr(self, test_name)()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
    fn()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
    with policy():
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
    raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2252341248.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0


----------------------------- Captured stdout call -----------------------------
Process 0 terminated with exit code 10, terminating remaining processes.
- generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-3543875e50638934.xml -
=========================== short test summary info ============================
FAILED [5.3122s] distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_apply_in_summon_raises_error_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
    getattr(self, test_name)()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
    fn()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
    with policy():
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
    raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2252341248.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
======================= 1 failed, 2 deselected in 5.47s ========================
Got exit code 1
Retrying single test...
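The repeated ProcessGroupNCCL warning above points at a missing teardown on the test ranks. A minimal sketch of the shutdown pattern the linked shutdown docs describe, assuming an env:// rendezvous (the run body is elided; this is not the harness's actual code):

# Sketch of the teardown the warning asks for; rendezvous settings are
# placeholders, not this job's configuration. RCCL on ROCm builds is
# addressed through the same "nccl" backend name.
import torch.distributed as dist

def run(rank: int, world_size: int) -> None:
    dist.init_process_group(
        backend="nccl",
        init_method="env://",
        rank=rank,
        world_size=world_size,
    )
    try:
        ...  # test or training body
    finally:
        dist.barrier()                # make sure all ranks finished
        dist.destroy_process_group()  # avoids the resource-leak warning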
2025-12-04T13:34:05.5554096Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-a5eb593ccaa06d8d.xml 2025-12-04T13:34:05.5554382Z ============================= test session starts ============================== 2025-12-04T13:34:05.5554591Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:34:05.5554778Z cachedir: .pytest_cache 2025-12-04T13:34:05.5554999Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:34:05.5555238Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:34:05.5555353Z configfile: pytest.ini 2025-12-04T13:34:05.5555574Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:34:05.5555841Z collecting ... collected 3 items / 2 deselected / 1 selected 2025-12-04T13:34:05.5556140Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_apply_in_summon_raises_error_cuda 2025-12-04T13:34:05.5556415Z Running 1 items in this shard 2025-12-04T13:34:05.5556487Z 2025-12-04T13:34:05.5556770Z distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_apply_in_summon_raises_error_cuda I1204 13:33:07.926000 236691 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 236760 2025-12-04T13:34:05.5557237Z I1204 13:33:07.927000 236691 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 236761 2025-12-04T13:34:05.5557781Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:34:05.5558215Z self.encoder = TransformerEncoder( 2025-12-04T13:34:05.5558791Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:34:05.5559375Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5559852Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:34:05.5560356Z self.encoder = TransformerEncoder( 2025-12-04T13:34:05.5560951Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:34:05.5561532Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5561669Z File "", line 1, in 2025-12-04T13:34:05.5561861Z File "/opt/conda/envs/py_3.10/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2025-12-04T13:34:05.5562059Z exitcode = _main(fd, parent_sentinel) 2025-12-04T13:34:05.5562244Z File "/opt/conda/envs/py_3.10/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2025-12-04T13:34:05.5562433Z return self._bootstrap(parent_sentinel) 2025-12-04T13:34:05.5562630Z File "/opt/conda/envs/py_3.10/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap 2025-12-04T13:34:05.5562817Z self.run() 2025-12-04T13:34:05.5562973Z File "/opt/conda/envs/py_3.10/lib/python3.10/multiprocessing/process.py", line 108, in run 2025-12-04T13:34:05.5563165Z self._target(*self._args, **self._kwargs) 2025-12-04T13:34:05.5563394Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1272, in _run 2025-12-04T13:34:05.5563621Z self.run_test(test_name, pipe) 2025-12-04T13:34:05.5563856Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5564098Z getattr(self, test_name)() 2025-12-04T13:34:05.5564324Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5564551Z fn() 2025-12-04T13:34:05.5564747Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5564973Z method(*args, **kwargs) 2025-12-04T13:34:05.5565189Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5565413Z method(*args, **kwargs) 2025-12-04T13:34:05.5565633Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5565856Z method(*args, **kwargs) 2025-12-04T13:34:05.5566092Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 428, in instantiated_test 2025-12-04T13:34:05.5566348Z result = test(self, **param_kwargs) 2025-12-04T13:34:05.5566584Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 227, in wrapper 2025-12-04T13:34:05.5566819Z return func(*args, **kwargs) 2025-12-04T13:34:05.5567055Z File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_apply.py", line 113, in test_apply_in_summon_raises_error 2025-12-04T13:34:05.5567306Z transformer.apply(self._init_linear_weights) 2025-12-04T13:34:05.5567571Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 586, in apply 2025-12-04T13:34:05.5567830Z self._assert_state(TrainingState.IDLE) 2025-12-04T13:34:05.5568103Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 1028, in _assert_state 2025-12-04T13:34:05.5568412Z traceback.print_stack() 2025-12-04T13:34:05.5568632Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:34:05.5568978Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 
2025-12-04T13:34:05.5569471Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5569977Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:34:05.5570495Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5570955Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:34:05.5571398Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5571865Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5572335Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5572802Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5573270Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5573723Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:34:05.5574180Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5574651Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:34:05.5575292Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2252341248. 
2025-12-04T13:34:05.5575893Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5576244Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5576817Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda 2025-12-04T13:34:05.5577299Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5577666Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5578118Z [rank0]:E1204 13:33:11.785000 236760 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:34:05.5578364Z dist init r=0, world=2 2025-12-04T13:34:05.5578500Z Asserting FSDP instance is: FullyShardedDataParallel( 2025-12-04T13:34:05.5578668Z (_fsdp_wrapped_module): TransformerWithSharedParams( 2025-12-04T13:34:05.5578817Z (embed_tokens): Embedding(23, 16) 2025-12-04T13:34:05.5578944Z (transformer): Transformer( 2025-12-04T13:34:05.5579066Z (encoder): TransformerEncoder( 2025-12-04T13:34:05.5579223Z (layers): ModuleList( 2025-12-04T13:34:05.5579350Z (0-1): 2 x FullyShardedDataParallel( 2025-12-04T13:34:05.5579503Z (_fsdp_wrapped_module): TransformerEncoderLayer( 2025-12-04T13:34:05.5579655Z (self_attn): MultiheadAttention( 2025-12-04T13:34:05.5579858Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2025-12-04T13:34:05.5580049Z ) 2025-12-04T13:34:05.5580215Z (linear1): Linear(in_features=16, out_features=8, bias=True) 2025-12-04T13:34:05.5580380Z (dropout): Dropout(p=0.1, inplace=False) 2025-12-04T13:34:05.5580541Z (linear2): Linear(in_features=8, out_features=16, bias=True) 2025-12-04T13:34:05.5580723Z (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2025-12-04T13:34:05.5580902Z (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2025-12-04T13:34:05.5581066Z (dropout1): Dropout(p=0.1, inplace=False) 2025-12-04T13:34:05.5581212Z (dropout2): Dropout(p=0.1, inplace=False) 2025-12-04T13:34:05.5581338Z ) 2025-12-04T13:34:05.5581428Z ) 2025-12-04T13:34:05.5581515Z ) 2025-12-04T13:34:05.5581634Z (norm): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2025-12-04T13:34:05.5581783Z ) 2025-12-04T13:34:05.5581880Z (decoder): TransformerDecoder( 2025-12-04T13:34:05.5582005Z (layers): ModuleList( 2025-12-04T13:34:05.5582131Z (0-1): 2 x FullyShardedDataParallel( 2025-12-04T13:34:05.5582283Z (_fsdp_wrapped_module): TransformerDecoderLayer( 2025-12-04T13:34:05.5582434Z (self_attn): MultiheadAttention( 2025-12-04T13:34:05.5582631Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2025-12-04T13:34:05.5582816Z ) 2025-12-04T13:34:05.5582927Z (multihead_attn): MultiheadAttention( 2025-12-04T13:34:05.5583129Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2025-12-04T13:34:05.5583314Z ) 2025-12-04T13:34:05.5583439Z (linear1): Linear(in_features=16, out_features=8, bias=True) 2025-12-04T13:34:05.5583600Z (dropout): Dropout(p=0.1, inplace=False) 
2025-12-04T13:34:05.5583760Z (linear2): Linear(in_features=8, out_features=16, bias=True) 2025-12-04T13:34:05.5583939Z (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2025-12-04T13:34:05.5584116Z (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2025-12-04T13:34:05.5584292Z (norm3): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2025-12-04T13:34:05.5584454Z (dropout1): Dropout(p=0.1, inplace=False) 2025-12-04T13:34:05.5584598Z (dropout2): Dropout(p=0.1, inplace=False) 2025-12-04T13:34:05.5584738Z (dropout3): Dropout(p=0.1, inplace=False) 2025-12-04T13:34:05.5584863Z ) 2025-12-04T13:34:05.5584955Z ) 2025-12-04T13:34:05.5585042Z ) 2025-12-04T13:34:05.5585160Z (norm): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2025-12-04T13:34:05.5585302Z ) 2025-12-04T13:34:05.5585386Z ) 2025-12-04T13:34:05.5585509Z (output_proj): Linear(in_features=16, out_features=23, bias=True) 2025-12-04T13:34:05.5585754Z (bn): BatchNorm1d(2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 2025-12-04T13:34:05.5585920Z ) 2025-12-04T13:34:05.5586004Z ) 2025-12-04T13:34:05.5586185Z ERROR: expected to be in states [] but current state is TrainingState.SUMMON_FULL_PARAMS 2025-12-04T13:34:05.5586497Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:34:05.5586843Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:34:05.5587370Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5587856Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:34:05.5588346Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5588799Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:34:05.5589246Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5589722Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5590251Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5590721Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5591186Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5591646Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:34:05.5592109Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5592576Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:34:05.5593216Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2099249152. 2025-12-04T13:34:05.5593814Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5594170Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5594738Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda 2025-12-04T13:34:05.5595252Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5595621Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5596039Z [rank1]:E1204 13:33:11.792000 236761 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:34:05.5596282Z dist init r=1, world=2 2025-12-04T13:34:05.5596772Z [rank0]:[W1204 13:33:11.449644422 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:34:05.5597187Z FAILED [5.2121s] [100%] 2025-12-04T13:34:05.5597254Z 2025-12-04T13:34:05.5597320Z =================================== FAILURES =================================== 2025-12-04T13:34:05.5597510Z _____________ TestApplyCUDA.test_apply_in_summon_raises_error_cuda _____________ 2025-12-04T13:34:05.5597683Z Traceback (most recent call last): 2025-12-04T13:34:05.5597930Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:34:05.5598177Z self._join_processes(fn) 2025-12-04T13:34:05.5598421Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:34:05.5598690Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:34:05.5598956Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:34:05.5599217Z raise RuntimeError(error) 2025-12-04T13:34:05.5599375Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:34:05.5599537Z Traceback (most recent call last): 2025-12-04T13:34:05.5599778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5600020Z getattr(self, test_name)() 2025-12-04T13:34:05.5600295Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5600529Z fn() 2025-12-04T13:34:05.5600734Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5600972Z method(*args, **kwargs) 2025-12-04T13:34:05.5601196Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5601429Z method(*args, **kwargs) 2025-12-04T13:34:05.5601649Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5601883Z with policy(): 2025-12-04T13:34:05.5602097Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5602331Z raise RuntimeError(msg) 2025-12-04T13:34:05.5602724Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2252341248. 2025-12-04T13:34:05.5603082Z 2025-12-04T13:34:05.5603158Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5603479Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda 2025-12-04T13:34:05.5603724Z 2025-12-04T13:34:05.5603871Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5604002Z 2025-12-04T13:34:05.5604004Z 2025-12-04T13:34:05.5604082Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:34:05.5604286Z Process 0 terminated with exit code 10, terminating remaining processes. 
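Note on the destroy_process_group() warning above: ProcessGroupNCCL emits it whenever a rank exits with the default process group still initialized. The teardown it asks for looks roughly like the sketch below; the entry-point name run_rank and the env:// rendezvous are assumptions for illustration, not code from this test suite.

import torch.distributed as dist

def run_rank(rank: int, world_size: int) -> None:
    # Assumes MASTER_ADDR/MASTER_PORT are set in the environment (env:// rendezvous).
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        pass  # per-rank test body would run here
    finally:
        # Explicit teardown before process exit avoids the resource-leak warning.
        dist.destroy_process_group()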
2025-12-04T13:34:05.5604654Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-a5eb593ccaa06d8d.xml - 2025-12-04T13:34:05.5604988Z =========================== short test summary info ============================ 2025-12-04T13:34:05.5605347Z FAILED [5.2121s] distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_apply_in_summon_raises_error_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:34:05.5605652Z Traceback (most recent call last): 2025-12-04T13:34:05.5605897Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5606148Z getattr(self, test_name)() 2025-12-04T13:34:05.5606380Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5606617Z fn() 2025-12-04T13:34:05.5606821Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5607055Z method(*args, **kwargs) 2025-12-04T13:34:05.5607277Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5607511Z method(*args, **kwargs) 2025-12-04T13:34:05.5607732Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5607959Z with policy(): 2025-12-04T13:34:05.5608173Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5608413Z raise RuntimeError(msg) 2025-12-04T13:34:05.5608804Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_apply_in_summon_raises_error_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2252341248. 2025-12-04T13:34:05.5609163Z 2025-12-04T13:34:05.5609237Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5609554Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_apply_in_summon_raises_error_cuda 2025-12-04T13:34:05.5609797Z 2025-12-04T13:34:05.5609889Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5610081Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
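The "CUDA driver API confirmed a leak" error that fails this test comes from a wrapper that snapshots per-device memory counters before the test body and compares them afterwards. Below is a simplified sketch of that bookkeeping using public torch.cuda counters; the real check in common_utils.py does additional bookkeeping before declaring a leak, so treat this as an illustration of the idea only.

import torch

def snapshot(device: int) -> tuple[int, int]:
    # Caching-allocator bytes, plus driver-level usage (total - free).
    free, total = torch.cuda.mem_get_info(device)
    return torch.cuda.memory_allocated(device), total - free

before = [snapshot(d) for d in range(torch.cuda.device_count())]
# ... run the test body under check ...
for d, (alloc_before, driver_before) in enumerate(before):
    alloc_after, driver_after = snapshot(d)
    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"Leak suspected on device {d}: caching allocator went "
            f"{alloc_before} -> {alloc_after} bytes, driver went "
            f"{driver_before} -> {driver_after} bytes."
        )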
2025-12-04T13:34:05.5610295Z ======================= 1 failed, 2 deselected in 5.37s ======================== 2025-12-04T13:34:05.5610438Z Got exit code 1 2025-12-04T13:34:05.5610651Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_apply_in_summon_raises_error_cuda 2025-12-04T13:34:05.5610968Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:34:05.5611328Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-c070250cf4bdd18d.xml 2025-12-04T13:34:05.5611619Z ============================= test session starts ============================== 2025-12-04T13:34:05.5611837Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:34:05.5612029Z cachedir: .pytest_cache 2025-12-04T13:34:05.5612255Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:34:05.5612543Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:34:05.5612665Z configfile: pytest.ini 2025-12-04T13:34:05.5612891Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:34:05.5613162Z collecting ... collected 3 items / 1 deselected / 2 selected 2025-12-04T13:34:05.5613318Z stepcurrent: skipping 1 already run items. 2025-12-04T13:34:05.5613454Z Running 2 items in this shard 2025-12-04T13:34:05.5613531Z 2025-12-04T13:34:05.5613833Z distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_nested_module_apply_cuda I1204 13:33:15.546000 236919 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 236988 2025-12-04T13:34:05.5614300Z I1204 13:33:15.547000 236919 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 236989 2025-12-04T13:34:05.5614996Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:34:05.5615597Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5616185Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:34:05.5616767Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5617009Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:34:05.5617358Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:34:05.5617856Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5618346Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:34:05.5618836Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5619286Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:34:05.5619737Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5620243Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5620707Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5621172Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5621637Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5622125Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:34:05.5622580Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5623046Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:34:05.5623703Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_nested_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2323644416. 
2025-12-04T13:34:05.5624293Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5624645Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5625201Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_nested_module_apply_cuda 2025-12-04T13:34:05.5625673Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5626038Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5626455Z [rank0]:E1204 13:33:19.832000 236988 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:34:05.5626701Z dist init r=0, world=2 2025-12-04T13:34:05.5626908Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:34:05.5627247Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:34:05.5627735Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5628218Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:34:05.5628696Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5629147Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:34:05.5629591Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5630056Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5630561Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5631025Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5631526Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5631982Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:34:05.5632475Z [rank1]:E1204 
13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5632940Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:34:05.5633565Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_nested_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320. 2025-12-04T13:34:05.5634152Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5634501Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5635054Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_nested_module_apply_cuda 2025-12-04T13:34:05.5635525Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5635893Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5636306Z [rank1]:E1204 13:33:19.837000 236989 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:34:05.5636549Z dist init r=1, world=2 2025-12-04T13:34:05.5636951Z [rank0]:[W1204 13:33:20.503662187 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:34:05.5637365Z FAILED [5.7125s] [ 50%] 2025-12-04T13:34:05.5637431Z 2025-12-04T13:34:05.5637488Z =================================== FAILURES =================================== 2025-12-04T13:34:05.5637671Z _________________ TestApplyCUDA.test_nested_module_apply_cuda __________________ 2025-12-04T13:34:05.5637844Z Traceback (most recent call last): 2025-12-04T13:34:05.5638089Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:34:05.5638334Z self._join_processes(fn) 2025-12-04T13:34:05.5638581Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:34:05.5638847Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:34:05.5639116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:34:05.5639377Z raise RuntimeError(error) 2025-12-04T13:34:05.5639531Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:34:05.5639694Z Traceback (most recent call last): 2025-12-04T13:34:05.5639935Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5640246Z getattr(self, test_name)() 2025-12-04T13:34:05.5640481Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5640716Z fn() 2025-12-04T13:34:05.5640919Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5641158Z method(*args, **kwargs) 2025-12-04T13:34:05.5641382Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5641613Z method(*args, **kwargs) 2025-12-04T13:34:05.5641867Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5642100Z with policy(): 2025-12-04T13:34:05.5642313Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5642551Z raise RuntimeError(msg) 2025-12-04T13:34:05.5642932Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_nested_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2323644416. 2025-12-04T13:34:05.5643276Z 2025-12-04T13:34:05.5643356Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5643661Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_nested_module_apply_cuda 2025-12-04T13:34:05.5643891Z 2025-12-04T13:34:05.5643982Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5644106Z 2025-12-04T13:34:05.5644111Z 2025-12-04T13:34:05.5644190Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:34:05.5644401Z Process 0 terminated with exit code 10, terminating remaining processes. 
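"RuntimeError: Process 0 exited with error code 10 and exception" is the parent test process re-raising a child's failure: _join_processes waits on one subprocess per rank and _check_return_codes turns any nonzero exit code into the error shown above. A condensed sketch of that pattern (illustrative only, not the harness's actual code):

import torch.multiprocessing as mp

def worker(rank: int, world_size: int) -> None:
    ...  # per-rank test body; exits nonzero on failure

if __name__ == "__main__":
    world_size = 2
    ctx = mp.spawn(worker, args=(world_size,), nprocs=world_size, join=False)
    for p in ctx.processes:
        p.join()
    for rank, p in enumerate(ctx.processes):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")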
2025-12-04T13:34:05.5644765Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-c070250cf4bdd18d.xml - 2025-12-04T13:34:05.5645101Z =========================== short test summary info ============================ 2025-12-04T13:34:05.5645413Z FAILED [5.7125s] distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_nested_module_apply_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:34:05.5645704Z Traceback (most recent call last): 2025-12-04T13:34:05.5645951Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5646196Z getattr(self, test_name)() 2025-12-04T13:34:05.5646431Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5646668Z fn() 2025-12-04T13:34:05.5646872Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5647106Z method(*args, **kwargs) 2025-12-04T13:34:05.5647327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5647558Z method(*args, **kwargs) 2025-12-04T13:34:05.5647777Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5648010Z with policy(): 2025-12-04T13:34:05.5648224Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5648461Z raise RuntimeError(msg) 2025-12-04T13:34:05.5648844Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_nested_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2323644416. 2025-12-04T13:34:05.5649223Z 2025-12-04T13:34:05.5649300Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5649603Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_nested_module_apply_cuda 2025-12-04T13:34:05.5649833Z 2025-12-04T13:34:05.5649920Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5650109Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:34:05.5650351Z ======================= 1 failed, 1 deselected in 5.87s ======================== 2025-12-04T13:34:05.5650496Z Got exit code 1 2025-12-04T13:34:05.5650598Z Retrying single test... 
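The "Got exit code 1" / "Retrying single test..." / "FAILED CONSISTENTLY" lines are the CI harness deciding whether a failure is flaky: the failing test is re-run in isolation, and only a failure on every attempt is reported as consistent while the shard continues (continue-through-error). Schematically, with hypothetical helper names (run_pytest_single is not a real function in the repo):

def run_pytest_single(test_id: str) -> int:
    ...  # hypothetical: run pytest on just this one test, return its exit code

def retry_failed_test(test_id: str, retries: int = 2) -> None:
    for attempt in range(retries + 1):
        if run_pytest_single(test_id) == 0:
            return  # passed (a pass on a retry would be flagged as flaky)
        print("Got exit code 1")
        if attempt < retries:
            print("Retrying single test...")
    print(f"FAILED CONSISTENTLY: {test_id}")
    # continue-through-error: fall through to the next test instead of aborting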
2025-12-04T13:34:05.5650858Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-d5ce45cbb6a7d075.xml 2025-12-04T13:34:05.5651150Z ============================= test session starts ============================== 2025-12-04T13:34:05.5651365Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:34:05.5651560Z cachedir: .pytest_cache 2025-12-04T13:34:05.5651786Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:34:05.5652026Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:34:05.5652146Z configfile: pytest.ini 2025-12-04T13:34:05.5652374Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:34:05.5652643Z collecting ... collected 3 items / 2 deselected / 1 selected 2025-12-04T13:34:05.5652935Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_nested_module_apply_cuda 2025-12-04T13:34:05.5653198Z Running 1 items in this shard 2025-12-04T13:34:05.5653271Z 2025-12-04T13:34:05.5653545Z distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_nested_module_apply_cuda I1204 13:33:23.600000 237147 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 237216 2025-12-04T13:34:05.5654003Z I1204 13:33:23.601000 237147 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 237217 2025-12-04T13:34:05.5654692Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:34:05.5655282Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5655873Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:34:05.5656457Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5656699Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:34:05.5657048Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:34:05.5657540Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5658057Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:34:05.5658537Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5658994Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:34:05.5659460Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5659927Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5660431Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5660896Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5661363Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5661817Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:34:05.5662278Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5662745Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:34:05.5663369Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_nested_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320. 
2025-12-04T13:34:05.5663956Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5664310Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5664862Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_nested_module_apply_cuda 2025-12-04T13:34:05.5665338Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5665706Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5666123Z [rank1]:E1204 13:33:27.945000 237217 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:34:05.5666372Z dist init r=1, world=2 2025-12-04T13:34:05.5666579Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:34:05.5666918Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:34:05.5667448Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5667931Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:34:05.5668442Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5668893Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:34:05.5669334Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5669799Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5670304Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5670768Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5671232Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5671684Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:34:05.5672146Z [rank0]:E1204 
13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5672613Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:34:05.5673240Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_nested_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2323644416. 2025-12-04T13:34:05.5673821Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5674172Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5674724Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_nested_module_apply_cuda 2025-12-04T13:34:05.5675192Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5675559Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5675977Z [rank0]:E1204 13:33:27.954000 237216 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:34:05.5676221Z dist init r=0, world=2 2025-12-04T13:34:05.5676659Z [rank0]:[W1204 13:33:28.737846641 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:34:05.5677071Z FAILED [5.9126s] [100%] 2025-12-04T13:34:05.5677134Z 2025-12-04T13:34:05.5677194Z =================================== FAILURES =================================== 2025-12-04T13:34:05.5677377Z _________________ TestApplyCUDA.test_nested_module_apply_cuda __________________ 2025-12-04T13:34:05.5677547Z Traceback (most recent call last): 2025-12-04T13:34:05.5677826Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:34:05.5678070Z self._join_processes(fn) 2025-12-04T13:34:05.5678314Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:34:05.5678579Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:34:05.5678843Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:34:05.5679100Z raise RuntimeError(error) 2025-12-04T13:34:05.5679251Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:34:05.5679408Z Traceback (most recent call last): 2025-12-04T13:34:05.5679643Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5679880Z getattr(self, test_name)() 2025-12-04T13:34:05.5680110Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5680410Z fn() 2025-12-04T13:34:05.5680608Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5680841Z method(*args, **kwargs) 2025-12-04T13:34:05.5681059Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5681288Z method(*args, **kwargs) 2025-12-04T13:34:05.5681503Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5681726Z with policy(): 2025-12-04T13:34:05.5681934Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5682169Z raise RuntimeError(msg) 2025-12-04T13:34:05.5682547Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_nested_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320. 2025-12-04T13:34:05.5682891Z 2025-12-04T13:34:05.5682964Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5683265Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_nested_module_apply_cuda 2025-12-04T13:34:05.5683493Z 2025-12-04T13:34:05.5683579Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5683706Z 2025-12-04T13:34:05.5683708Z 2025-12-04T13:34:05.5683788Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:34:05.5683990Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:34:05.5684352Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-d5ce45cbb6a7d075.xml - 2025-12-04T13:34:05.5684683Z =========================== short test summary info ============================ 2025-12-04T13:34:05.5685028Z FAILED [5.9126s] distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_nested_module_apply_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:34:05.5685316Z Traceback (most recent call last): 2025-12-04T13:34:05.5685555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5685796Z getattr(self, test_name)() 2025-12-04T13:34:05.5686025Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5686287Z fn() 2025-12-04T13:34:05.5686548Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5691648Z method(*args, **kwargs) 2025-12-04T13:34:05.5691872Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5692106Z method(*args, **kwargs) 2025-12-04T13:34:05.5692322Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5692546Z with policy(): 2025-12-04T13:34:05.5692758Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5692988Z raise RuntimeError(msg) 2025-12-04T13:34:05.5693372Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_nested_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320. 2025-12-04T13:34:05.5693733Z 2025-12-04T13:34:05.5693811Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5694116Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_nested_module_apply_cuda 2025-12-04T13:34:05.5694344Z 2025-12-04T13:34:05.5694433Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5694622Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:34:05.5694787Z ======================= 1 failed, 2 deselected in 6.07s ======================== 2025-12-04T13:34:05.5694927Z Got exit code 1 2025-12-04T13:34:05.5695022Z Retrying single test... 
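Every run of this test also re-emits the FSDP UserWarning about device_id "cuda" having no explicit index. The fix the warning itself suggests is to pin each rank to its GPU before FSDP initialization, or to pass an indexed device. A minimal sketch, assuming one GPU per rank (wrap_model is an illustrative name, not from the test):

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(module: torch.nn.Module, rank: int) -> FSDP:
    # Pin this process to its GPU before FSDP init, as the warning suggests...
    torch.cuda.set_device(rank)
    # ...and/or hand FSDP an indexed device rather than the bare "cuda" string.
    return FSDP(module, device_id=torch.device("cuda", rank))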
2025-12-04T13:34:05.5695283Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-4ae4ccc9330b0c55.xml 2025-12-04T13:34:05.5695570Z ============================= test session starts ============================== 2025-12-04T13:34:05.5695782Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:34:05.5695967Z cachedir: .pytest_cache 2025-12-04T13:34:05.5696190Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:34:05.5696430Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:34:05.5696546Z configfile: pytest.ini 2025-12-04T13:34:05.5696773Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:34:05.5697041Z collecting ... collected 3 items / 2 deselected / 1 selected 2025-12-04T13:34:05.5697328Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_nested_module_apply_cuda 2025-12-04T13:34:05.5697586Z Running 1 items in this shard 2025-12-04T13:34:05.5697660Z 2025-12-04T13:34:05.5697937Z distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_nested_module_apply_cuda I1204 13:33:31.831000 237375 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 237444 2025-12-04T13:34:05.5698400Z I1204 13:33:31.831000 237375 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 237445 2025-12-04T13:34:05.5699145Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:34:05.5699729Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5700378Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:34:05.5700963Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5701201Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:34:05.5701543Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:34:05.5702033Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5702514Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:34:05.5702993Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5703444Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:34:05.5703883Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5704347Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5704810Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5705271Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5705733Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5706180Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:34:05.5706630Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5707091Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:34:05.5707716Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_nested_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320. 
2025-12-04T13:34:05.5708333Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5708680Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5709246Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_nested_module_apply_cuda 2025-12-04T13:34:05.5709711Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5710075Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5710523Z [rank1]:E1204 13:33:36.093000 237445 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:34:05.5710767Z dist init r=1, world=2 2025-12-04T13:34:05.5710967Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:34:05.5711303Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:34:05.5711785Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5712264Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:34:05.5712740Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5713187Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:34:05.5713627Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5714087Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5714548Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5715009Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5715467Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5715918Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:34:05.5716371Z [rank0]:E1204 
13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:34:05.5716831Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T13:34:05.5717488Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_nested_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2323644416.
2025-12-04T13:34:05.5718066Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T13:34:05.5718440Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T13:34:05.5718988Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_nested_module_apply_cuda
2025-12-04T13:34:05.5719452Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T13:34:05.5719814Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:34:05.5720253Z [rank0]:E1204 13:33:36.144000 237444 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T13:34:05.5720492Z dist init r=0, world=2
2025-12-04T13:34:05.5720891Z [rank0]:[W1204 13:33:36.917784972 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources.
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T13:34:05.5721299Z FAILED [5.7125s] [100%]
2025-12-04T13:34:05.5721361Z 
2025-12-04T13:34:05.5721422Z =================================== FAILURES ===================================
2025-12-04T13:34:05.5721601Z _________________ TestApplyCUDA.test_nested_module_apply_cuda __________________
2025-12-04T13:34:05.5721764Z Traceback (most recent call last):
2025-12-04T13:34:05.5722006Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T13:34:05.5722244Z self._join_processes(fn)
2025-12-04T13:34:05.5722486Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T13:34:05.5722746Z self._check_return_codes(fn, elapsed_time)
2025-12-04T13:34:05.5723011Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T13:34:05.5723264Z raise RuntimeError(error)
2025-12-04T13:34:05.5723411Z RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T13:34:05.5723577Z Traceback (most recent call last):
2025-12-04T13:34:05.5723819Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:34:05.5724056Z getattr(self, test_name)()
2025-12-04T13:34:05.5724284Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:34:05.5724512Z fn()
2025-12-04T13:34:05.5724709Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:34:05.5724940Z method(*args, **kwargs)
2025-12-04T13:34:05.5725162Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:34:05.5725389Z method(*args, **kwargs)
2025-12-04T13:34:05.5725604Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:34:05.5725879Z with policy():
2025-12-04T13:34:05.5726087Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:34:05.5726315Z raise RuntimeError(msg)
2025-12-04T13:34:05.5726692Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_nested_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320.
2025-12-04T13:34:05.5727035Z 
2025-12-04T13:34:05.5727140Z To execute this test, run the following from the base repo dir:
2025-12-04T13:34:05.5727441Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_nested_module_apply_cuda
2025-12-04T13:34:05.5727667Z 
2025-12-04T13:34:05.5727756Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:34:05.5727885Z 
2025-12-04T13:34:05.5727887Z 
2025-12-04T13:34:05.5727964Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T13:34:05.5728163Z Process 1 terminated with exit code 10, terminating remaining processes.
2025-12-04T13:34:05.5728520Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-4ae4ccc9330b0c55.xml -
2025-12-04T13:34:05.5728849Z =========================== short test summary info ============================
2025-12-04T13:34:05.5729157Z FAILED [5.7125s] distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_nested_module_apply_cuda - RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T13:34:05.5729442Z Traceback (most recent call last):
2025-12-04T13:34:05.5729682Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:34:05.5729928Z getattr(self, test_name)()
2025-12-04T13:34:05.5730159Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:34:05.5730476Z fn()
2025-12-04T13:34:05.5730673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:34:05.5730898Z method(*args, **kwargs)
2025-12-04T13:34:05.5731114Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:34:05.5731338Z method(*args, **kwargs)
2025-12-04T13:34:05.5731556Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:34:05.5731779Z with policy():
2025-12-04T13:34:05.5731984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:34:05.5732214Z raise RuntimeError(msg)
2025-12-04T13:34:05.5732589Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_nested_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320.
2025-12-04T13:34:05.5732933Z 
2025-12-04T13:34:05.5733007Z To execute this test, run the following from the base repo dir:
2025-12-04T13:34:05.5733304Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_nested_module_apply_cuda
2025-12-04T13:34:05.5733529Z 
2025-12-04T13:34:05.5733618Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:34:05.5733802Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
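The RuntimeError above is raised by the CUDA mem-leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: it snapshots both caching-allocator and driver-reported memory around the test body and fails when both have grown, which is why the message quotes two pairs of numbers. A minimal sketch of that before/after accounting, assuming a visible CUDA/ROCm device; `run_with_leak_check` is a hypothetical helper, not the actual harness in `common_utils.py`:

```python
# Illustrative sketch only; the real checker lives in
# torch/testing/_internal/common_utils.py and is more careful
# (retries, per-device bookkeeping, etc.).
import torch

def run_with_leak_check(fn, device=0):  # hypothetical helper name
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)  # caching-allocator bytes
    free_before, _ = torch.cuda.mem_get_info(device)    # driver-level free bytes
    fn()
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    # Growth in both views is what the log calls a "confirmed" leak.
    if alloc_after > alloc_before and free_after < free_before:
        raise RuntimeError(
            f"possible leak on device {device}: allocator "
            f"{alloc_before} -> {alloc_after} bytes"
        )
```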
2025-12-04T13:34:05.5733964Z ======================= 1 failed, 2 deselected in 5.87s ======================== 2025-12-04T13:34:05.5734136Z Got exit code 1 2025-12-04T13:34:05.5734328Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_nested_module_apply_cuda 2025-12-04T13:34:05.5734625Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:34:05.5734974Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-90474295012e1f76.xml 2025-12-04T13:34:05.5735255Z ============================= test session starts ============================== 2025-12-04T13:34:05.5735462Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:34:05.5735680Z cachedir: .pytest_cache 2025-12-04T13:34:05.5735900Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:34:05.5736133Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:34:05.5736252Z configfile: pytest.ini 2025-12-04T13:34:05.5736474Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:34:05.5736739Z collecting ... collected 3 items / 2 deselected / 1 selected 2025-12-04T13:34:05.5736894Z stepcurrent: skipping 2 already run items. 2025-12-04T13:34:05.5737020Z Running 1 items in this shard 2025-12-04T13:34:05.5737089Z 2025-12-04T13:34:05.5737366Z distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_transformer_module_apply_cuda I1204 13:33:40.078000 237603 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 237672 2025-12-04T13:34:05.5737829Z I1204 13:33:40.079000 237603 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 237673 2025-12-04T13:34:05.5738375Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:34:05.5738817Z self.encoder = TransformerEncoder( 2025-12-04T13:34:05.5739246Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:34:05.5739676Z self.encoder = TransformerEncoder( 2025-12-04T13:34:05.5740284Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:34:05.5740872Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5741456Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:34:05.5742035Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5742271Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:34:05.5742612Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:34:05.5743098Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5743612Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:34:05.5744090Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5744218Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:34:05.5744527Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5744677Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5744959Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5745106Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5745382Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5745519Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:34:05.5745797Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5745946Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:34:05.5746392Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320. 
2025-12-04T13:34:05.5746508Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5746704Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5747031Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda 2025-12-04T13:34:05.5747146Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5747358Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5747526Z [rank1]:E1204 13:33:44.499000 237673 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:34:05.5747567Z dist init r=1, world=2 2025-12-04T13:34:05.5747704Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:34:05.5747864Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:34:05.5748173Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5748326Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:34:05.5748637Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5748762Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:34:05.5749036Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5749186Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5749462Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5749610Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5749885Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5750022Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:34:05.5750348Z 
[rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5750495Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:34:05.5750940Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2323644416. 2025-12-04T13:34:05.5751054Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5751250Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5751574Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda 2025-12-04T13:34:05.5751685Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5751900Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5752066Z [rank0]:E1204 13:33:44.505000 237672 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:34:05.5752138Z dist init r=0, world=2 2025-12-04T13:34:05.5752473Z [rank0]:[W1204 13:33:44.194734209 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:34:05.5752513Z FAILED [5.9138s] [100%] 2025-12-04T13:34:05.5752515Z 2025-12-04T13:34:05.5752570Z =================================== FAILURES =================================== 2025-12-04T13:34:05.5752659Z _______________ TestApplyCUDA.test_transformer_module_apply_cuda _______________ 2025-12-04T13:34:05.5752729Z Traceback (most recent call last): 2025-12-04T13:34:05.5752892Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:34:05.5752935Z self._join_processes(fn) 2025-12-04T13:34:05.5753108Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:34:05.5753164Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:34:05.5753341Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:34:05.5753386Z raise RuntimeError(error) 2025-12-04T13:34:05.5753464Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:34:05.5753509Z Traceback (most recent call last): 2025-12-04T13:34:05.5753671Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5753714Z getattr(self, test_name)() 2025-12-04T13:34:05.5753871Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5753907Z fn() 2025-12-04T13:34:05.5754060Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5754102Z method(*args, **kwargs) 2025-12-04T13:34:05.5754252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5754292Z method(*args, **kwargs) 2025-12-04T13:34:05.5754442Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5754479Z with policy(): 2025-12-04T13:34:05.5754632Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5754673Z raise RuntimeError(msg) 2025-12-04T13:34:05.5754991Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2323644416. 
2025-12-04T13:34:05.5754994Z 2025-12-04T13:34:05.5755071Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5755269Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda 2025-12-04T13:34:05.5755272Z 2025-12-04T13:34:05.5755359Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5755361Z 2025-12-04T13:34:05.5755420Z Process 1 exited with error code 10 and exception: 2025-12-04T13:34:05.5755465Z Traceback (most recent call last): 2025-12-04T13:34:05.5755628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5755671Z getattr(self, test_name)() 2025-12-04T13:34:05.5755829Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5755888Z fn() 2025-12-04T13:34:05.5756037Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5756076Z method(*args, **kwargs) 2025-12-04T13:34:05.5756226Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5756265Z method(*args, **kwargs) 2025-12-04T13:34:05.5756417Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5756453Z with policy(): 2025-12-04T13:34:05.5756626Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5756666Z raise RuntimeError(msg) 2025-12-04T13:34:05.5756980Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320. 2025-12-04T13:34:05.5756984Z 2025-12-04T13:34:05.5757056Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5757250Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda 2025-12-04T13:34:05.5757252Z 2025-12-04T13:34:05.5757338Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5757340Z 2025-12-04T13:34:05.5757342Z 2025-12-04T13:34:05.5757419Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:34:05.5757504Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:34:05.5757739Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-90474295012e1f76.xml - 2025-12-04T13:34:05.5757802Z =========================== short test summary info ============================ 2025-12-04T13:34:05.5758019Z FAILED [5.9138s] distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_transformer_module_apply_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:34:05.5758064Z Traceback (most recent call last): 2025-12-04T13:34:05.5758226Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5758269Z getattr(self, test_name)() 2025-12-04T13:34:05.5758427Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5758462Z fn() 2025-12-04T13:34:05.5758612Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5758655Z method(*args, **kwargs) 2025-12-04T13:34:05.5758805Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5758845Z method(*args, **kwargs) 2025-12-04T13:34:05.5758992Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5759029Z with policy(): 2025-12-04T13:34:05.5759179Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5759220Z raise RuntimeError(msg) 2025-12-04T13:34:05.5759535Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2323644416. 
2025-12-04T13:34:05.5759537Z 2025-12-04T13:34:05.5759632Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5759827Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda 2025-12-04T13:34:05.5759829Z 2025-12-04T13:34:05.5759915Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5759917Z 2025-12-04T13:34:05.5759974Z Process 1 exited with error code 10 and exception: 2025-12-04T13:34:05.5760018Z Traceback (most recent call last): 2025-12-04T13:34:05.5760246Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5760287Z getattr(self, test_name)() 2025-12-04T13:34:05.5760444Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5760477Z fn() 2025-12-04T13:34:05.5760629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5760668Z method(*args, **kwargs) 2025-12-04T13:34:05.5760817Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5760855Z method(*args, **kwargs) 2025-12-04T13:34:05.5761005Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5761040Z with policy(): 2025-12-04T13:34:05.5761191Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5761231Z raise RuntimeError(msg) 2025-12-04T13:34:05.5761547Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320. 2025-12-04T13:34:05.5761551Z 2025-12-04T13:34:05.5761623Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5761818Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda 2025-12-04T13:34:05.5761820Z 2025-12-04T13:34:05.5761905Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5761968Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:34:05.5762032Z ======================= 1 failed, 2 deselected in 6.06s ======================== 2025-12-04T13:34:05.5762069Z Got exit code 1 2025-12-04T13:34:05.5762108Z Retrying single test... 
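Every failing run above also ends with the ProcessGroupNCCL warning that `destroy_process_group()` was not called before program exit. A hedged sketch of the explicit teardown the warning asks for, assuming a torchrun-style launcher that provides RANK and WORLD_SIZE plus the rendezvous variables; the workload itself is a placeholder:

```python
# Sketch of explicit process-group teardown, per the ProcessGroupNCCL
# warning above. Assumes RANK/WORLD_SIZE (and MASTER_ADDR/MASTER_PORT)
# are set by the launcher, as torchrun does.
import os
import torch
import torch.distributed as dist

def main():
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    torch.cuda.set_device(rank)  # single-node assumption: rank == local GPU index
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        pass  # placeholder for the actual test body / workload
    finally:
        dist.destroy_process_group()  # the call the warning says was missing

if __name__ == "__main__":
    main()
```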
2025-12-04T13:34:05.5762302Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-eabaab15ed1d80df.xml 2025-12-04T13:34:05.5762360Z ============================= test session starts ============================== 2025-12-04T13:34:05.5762472Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:34:05.5762512Z cachedir: .pytest_cache 2025-12-04T13:34:05.5762669Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:34:05.5762714Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:34:05.5762754Z configfile: pytest.ini 2025-12-04T13:34:05.5762916Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:34:05.5762991Z collecting ... collected 3 items / 2 deselected / 1 selected 2025-12-04T13:34:05.5763182Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_transformer_module_apply_cuda 2025-12-04T13:34:05.5763262Z Running 1 items in this shard 2025-12-04T13:34:05.5763264Z 2025-12-04T13:34:05.5763541Z distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_transformer_module_apply_cuda I1204 13:33:48.564000 237831 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 237900 2025-12-04T13:34:05.5763695Z I1204 13:33:48.565000 237831 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 237901 2025-12-04T13:34:05.5764081Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:34:05.5764130Z self.encoder = TransformerEncoder( 2025-12-04T13:34:05.5764479Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:34:05.5764529Z self.encoder = TransformerEncoder( 2025-12-04T13:34:05.5765018Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:34:05.5765079Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5765568Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:34:05.5765628Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5765770Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:34:05.5765932Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:34:05.5766222Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5766376Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:34:05.5766663Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5766789Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:34:05.5767065Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5767212Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5767488Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5767636Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5767934Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5768069Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:34:05.5768366Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5768516Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:34:05.5768959Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2323644416. 
2025-12-04T13:34:05.5769076Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5769270Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5769595Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda 2025-12-04T13:34:05.5769707Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5769922Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5770087Z [rank0]:E1204 13:33:52.972000 237900 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:34:05.5770127Z dist init r=0, world=2 2025-12-04T13:34:05.5770299Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:34:05.5770459Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:34:05.5770747Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5770901Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:34:05.5771186Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5771309Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:34:05.5771587Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5771734Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5772009Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5772182Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5772458Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5772622Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:34:05.5772900Z 
[rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5773048Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:34:05.5773487Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320. 2025-12-04T13:34:05.5773602Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5773798Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5774122Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda 2025-12-04T13:34:05.5774236Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:34:05.5774447Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5774614Z [rank1]:E1204 13:33:52.973000 237901 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:34:05.5774651Z dist init r=1, world=2 2025-12-04T13:34:05.5774990Z [rank0]:[W1204 13:33:53.661322330 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:34:05.5775030Z FAILED [5.9129s] [100%] 2025-12-04T13:34:05.5775032Z 2025-12-04T13:34:05.5775087Z =================================== FAILURES =================================== 2025-12-04T13:34:05.5775175Z _______________ TestApplyCUDA.test_transformer_module_apply_cuda _______________ 2025-12-04T13:34:05.5775221Z Traceback (most recent call last): 2025-12-04T13:34:05.5775382Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:34:05.5775426Z self._join_processes(fn) 2025-12-04T13:34:05.5775600Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:34:05.5775655Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:34:05.5775833Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:34:05.5775875Z raise RuntimeError(error) 2025-12-04T13:34:05.5775976Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:34:05.5776019Z Traceback (most recent call last): 2025-12-04T13:34:05.5776179Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5776220Z getattr(self, test_name)() 2025-12-04T13:34:05.5776378Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5776413Z fn() 2025-12-04T13:34:05.5776583Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5776623Z method(*args, **kwargs) 2025-12-04T13:34:05.5776775Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5776814Z method(*args, **kwargs) 2025-12-04T13:34:05.5776966Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5777002Z with policy(): 2025-12-04T13:34:05.5777153Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5777193Z raise RuntimeError(msg) 2025-12-04T13:34:05.5777511Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2323644416. 
2025-12-04T13:34:05.5777514Z 2025-12-04T13:34:05.5777588Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5777785Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda 2025-12-04T13:34:05.5777789Z 2025-12-04T13:34:05.5777875Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5777877Z 2025-12-04T13:34:05.5777936Z Process 1 exited with error code 10 and exception: 2025-12-04T13:34:05.5777980Z Traceback (most recent call last): 2025-12-04T13:34:05.5778141Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5778183Z getattr(self, test_name)() 2025-12-04T13:34:05.5778342Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5778375Z fn() 2025-12-04T13:34:05.5778529Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5778569Z method(*args, **kwargs) 2025-12-04T13:34:05.5778719Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5778762Z method(*args, **kwargs) 2025-12-04T13:34:05.5778910Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5778948Z with policy(): 2025-12-04T13:34:05.5779099Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5779140Z raise RuntimeError(msg) 2025-12-04T13:34:05.5779458Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320. 2025-12-04T13:34:05.5779460Z 2025-12-04T13:34:05.5779533Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5779727Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda 2025-12-04T13:34:05.5779756Z 2025-12-04T13:34:05.5779844Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5779846Z 2025-12-04T13:34:05.5779848Z 2025-12-04T13:34:05.5779922Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:34:05.5780010Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:34:05.5780285Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-eabaab15ed1d80df.xml - 2025-12-04T13:34:05.5780377Z =========================== short test summary info ============================ 2025-12-04T13:34:05.5780597Z FAILED [5.9129s] distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_transformer_module_apply_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:34:05.5780645Z Traceback (most recent call last): 2025-12-04T13:34:05.5780809Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5780849Z getattr(self, test_name)() 2025-12-04T13:34:05.5781009Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5781043Z fn() 2025-12-04T13:34:05.5781194Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5781234Z method(*args, **kwargs) 2025-12-04T13:34:05.5781386Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5781424Z method(*args, **kwargs) 2025-12-04T13:34:05.5781576Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5781615Z with policy(): 2025-12-04T13:34:05.5781767Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5781807Z raise RuntimeError(msg) 2025-12-04T13:34:05.5782124Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2323644416. 
2025-12-04T13:34:05.5782125Z 2025-12-04T13:34:05.5782198Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5782394Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda 2025-12-04T13:34:05.5782396Z 2025-12-04T13:34:05.5782481Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5782484Z 2025-12-04T13:34:05.5782542Z Process 1 exited with error code 10 and exception: 2025-12-04T13:34:05.5782586Z Traceback (most recent call last): 2025-12-04T13:34:05.5782748Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5782789Z getattr(self, test_name)() 2025-12-04T13:34:05.5782945Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5782979Z fn() 2025-12-04T13:34:05.5783129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5783169Z method(*args, **kwargs) 2025-12-04T13:34:05.5783317Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5783356Z method(*args, **kwargs) 2025-12-04T13:34:05.5783533Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5783570Z with policy(): 2025-12-04T13:34:05.5783720Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5783761Z raise RuntimeError(msg) 2025-12-04T13:34:05.5784075Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320. 2025-12-04T13:34:05.5784077Z 2025-12-04T13:34:05.5784171Z To execute this test, run the following from the base repo dir: 2025-12-04T13:34:05.5784363Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda 2025-12-04T13:34:05.5784367Z 2025-12-04T13:34:05.5784452Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:34:05.5784514Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:34:05.5784577Z ======================= 1 failed, 2 deselected in 6.07s ======================== 2025-12-04T13:34:05.5784614Z Got exit code 1 2025-12-04T13:34:05.5784655Z Retrying single test... 
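The UserWarning from `torch/distributed/fsdp/_init_utils.py` repeats on every run because FSDP is handed the bare device string `cuda` with no index, and the warning itself names the two fixes. A short sketch of both, assuming `init_process_group` has already run on a single node where the global rank equals the local GPU index; `TinyModel` is a placeholder for the test's module:

```python
# The two remedies the FSDP warning above suggests. TinyModel is a
# placeholder; assumes the process group is already initialized and
# this is a single-node job (rank == local GPU index).
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(8, 8)

    def forward(self, x):
        return self.lin(x)

rank = dist.get_rank()

# Option 1: set the current device first, so a bare "cuda" resolves correctly.
torch.cuda.set_device(rank)
model = FSDP(TinyModel().cuda())

# Option 2: pass an explicit device index instead of the bare "cuda" string.
model = FSDP(TinyModel(), device_id=torch.device("cuda", rank))
```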
2025-12-04T13:34:05.5784848Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-ffbcf4dc98ebf483.xml 2025-12-04T13:34:05.5784906Z ============================= test session starts ============================== 2025-12-04T13:34:05.5785019Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:34:05.5785064Z cachedir: .pytest_cache 2025-12-04T13:34:05.5785222Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:34:05.5785271Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:34:05.5785312Z configfile: pytest.ini 2025-12-04T13:34:05.5785476Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:34:05.5785547Z collecting ... collected 3 items / 2 deselected / 1 selected 2025-12-04T13:34:05.5785738Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_transformer_module_apply_cuda 2025-12-04T13:34:05.5785782Z Running 1 items in this shard 2025-12-04T13:34:05.5785784Z 2025-12-04T13:34:05.5786065Z distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_transformer_module_apply_cuda I1204 13:33:56.817000 238059 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 238128 2025-12-04T13:34:05.5786221Z I1204 13:33:56.818000 238059 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 238129 2025-12-04T13:34:05.5786579Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:34:05.5786630Z self.encoder = TransformerEncoder( 2025-12-04T13:34:05.5787120Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:34:05.5787181Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5787532Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:34:05.5787599Z self.encoder = TransformerEncoder( 2025-12-04T13:34:05.5788090Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:34:05.5788171Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:34:05.5788316Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:34:05.5788479Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:34:05.5788772Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:34:05.5788928Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:34:05.5789219Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:34:05.5789346Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:34:05.5789625Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5789777Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5790054Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:34:05.5790236Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:34:05.5790519Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:34:05.5790657Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:34:05.5790936Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:34:05.5791083Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:34:05.5791529Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320. 
2025-12-04T13:34:05.5791643Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T13:34:05.5791867Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T13:34:05.5792191Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda
2025-12-04T13:34:05.5792304Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T13:34:05.5792544Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:34:05.5792710Z [rank1]:E1204 13:34:01.254000 238129 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T13:34:05.5792750Z dist init r=1, world=2
2025-12-04T13:34:05.5792889Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T13:34:05.5793047Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T13:34:05.5793332Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:34:05.5793488Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T13:34:05.5793770Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:34:05.5793896Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T13:34:05.5794171Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:34:05.5794318Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T13:34:05.5794594Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:34:05.5794740Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T13:34:05.5795019Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:34:05.5795153Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T13:34:05.5795430Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:34:05.5795578Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T13:34:05.5796019Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2017460224 and is now 2323644416.
2025-12-04T13:34:05.5796162Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T13:34:05.5796358Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T13:34:05.5796700Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda
2025-12-04T13:34:05.5796813Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T13:34:05.5797025Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:34:05.5797191Z [rank0]:E1204 13:34:01.296000 238128 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T13:34:05.5797230Z dist init r=0, world=2
2025-12-04T13:34:05.5797566Z [rank0]:[W1204 13:34:01.099062179 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T13:34:05.5797607Z FAILED [5.9131s] [100%]
2025-12-04T13:34:05.5797608Z
2025-12-04T13:34:05.5797664Z =================================== FAILURES ===================================
2025-12-04T13:34:05.5797752Z _______________ TestApplyCUDA.test_transformer_module_apply_cuda _______________
2025-12-04T13:34:05.5797798Z Traceback (most recent call last):
2025-12-04T13:34:05.5797960Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T13:34:05.5798004Z self._join_processes(fn)
2025-12-04T13:34:05.5798178Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T13:34:05.5798230Z self._check_return_codes(fn, elapsed_time)
2025-12-04T13:34:05.5798409Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T13:34:05.5798452Z raise RuntimeError(error)
2025-12-04T13:34:05.5798535Z RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T13:34:05.5798580Z Traceback (most recent call last):
2025-12-04T13:34:05.5798742Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:34:05.5798785Z getattr(self, test_name)()
2025-12-04T13:34:05.5798943Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:34:05.5798978Z fn()
2025-12-04T13:34:05.5799129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:34:05.5799170Z method(*args, **kwargs)
2025-12-04T13:34:05.5799322Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:34:05.5799362Z method(*args, **kwargs)
2025-12-04T13:34:05.5799512Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:34:05.5799550Z with policy():
2025-12-04T13:34:05.5799700Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:34:05.5799742Z raise RuntimeError(msg)
2025-12-04T13:34:05.5800083Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320.
2025-12-04T13:34:05.5800085Z
2025-12-04T13:34:05.5800159Z To execute this test, run the following from the base repo dir:
2025-12-04T13:34:05.5800395Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda
2025-12-04T13:34:05.5800397Z
2025-12-04T13:34:05.5800514Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:34:05.5800516Z
2025-12-04T13:34:05.5800518Z
2025-12-04T13:34:05.5800593Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T13:34:05.5800681Z Process 1 terminated with exit code 10, terminating remaining processes.
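Annotation: the ProcessGroupNCCL warning above fires because the spawned test processes exited without tearing down their process group. A minimal sketch of the shutdown hygiene the warning asks for, assuming torchrun-style environment variables (RANK, WORLD_SIZE, MASTER_ADDR, MASTER_PORT); this is illustrative only, not the test harness's actual code:

import os
import torch
import torch.distributed as dist

def main() -> None:
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        # placeholder collective on this rank's GPU
        t = torch.ones(1, device=f"cuda:{rank % torch.cuda.device_count()}")
        dist.all_reduce(t)
    finally:
        # always tear down; skipping this triggers the warning above
        dist.destroy_process_group()

if __name__ == "__main__":
    main()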
2025-12-04T13:34:05.5800921Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-ffbcf4dc98ebf483.xml -
2025-12-04T13:34:05.5800980Z =========================== short test summary info ============================
2025-12-04T13:34:05.5801200Z FAILED [5.9131s] distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_transformer_module_apply_cuda - RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T13:34:05.5801244Z Traceback (most recent call last):
2025-12-04T13:34:05.5801408Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:34:05.5801450Z getattr(self, test_name)()
2025-12-04T13:34:05.5801610Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:34:05.5801645Z fn()
2025-12-04T13:34:05.5801797Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:34:05.5801838Z method(*args, **kwargs)
2025-12-04T13:34:05.5801991Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:34:05.5802030Z method(*args, **kwargs)
2025-12-04T13:34:05.5802180Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:34:05.5802216Z with policy():
2025-12-04T13:34:05.5802369Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:34:05.5802409Z raise RuntimeError(msg)
2025-12-04T13:34:05.5802726Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestApplyCUDA.test_transformer_module_apply_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 1864368128 and is now 2170552320.
2025-12-04T13:34:05.5802730Z
2025-12-04T13:34:05.5802802Z To execute this test, run the following from the base repo dir:
2025-12-04T13:34:05.5802998Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_apply.py TestApplyCUDA.test_transformer_module_apply_cuda
2025-12-04T13:34:05.5803000Z
2025-12-04T13:34:05.5803085Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:34:05.5803150Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
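Annotation: the RuntimeError above comes from the policy enabled by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1, which snapshots per-device memory before the test body runs and re-checks it afterwards (the real policy lives in torch/testing/_internal/common_utils.py and is the "with policy():" frame in the traceback). A rough sketch of that before/after comparison using public torch.cuda APIs; the retry and tolerance logic of the real check is omitted:

import torch

class LeakCheck:
    # illustrative context manager, not PyTorch's actual CudaMemoryLeakCheck
    def __init__(self, device: int = 0) -> None:
        self.device = device

    def __enter__(self) -> "LeakCheck":
        torch.cuda.synchronize(self.device)
        self.alloc_before = torch.cuda.memory_allocated(self.device)
        free, total = torch.cuda.mem_get_info(self.device)
        self.driver_before = total - free  # driver-side usage, as reported above
        return self

    def __exit__(self, *exc) -> None:
        torch.cuda.synchronize(self.device)
        alloc_after = torch.cuda.memory_allocated(self.device)
        free, total = torch.cuda.mem_get_info(self.device)
        driver_after = total - free
        if alloc_after > self.alloc_before and driver_after > self.driver_before:
            raise RuntimeError(
                f"possible leak on device {self.device}: caching allocator "
                f"{self.alloc_before} -> {alloc_after}, driver "
                f"{self.driver_before} -> {driver_after}"
            )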
2025-12-04T13:34:05.5803213Z ======================= 1 failed, 2 deselected in 6.07s ========================
2025-12-04T13:34:05.5803249Z Got exit code 1
2025-12-04T13:34:05.5803400Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_transformer_module_apply_cuda
2025-12-04T13:34:05.5803525Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T13:34:05.5803745Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-4d5627d079bf00b5.xml
2025-12-04T13:34:05.5803803Z ============================= test session starts ==============================
2025-12-04T13:34:05.5803918Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T13:34:05.5803960Z cachedir: .pytest_cache
2025-12-04T13:34:05.5804119Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T13:34:05.5804164Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T13:34:05.5804227Z configfile: pytest.ini
2025-12-04T13:34:05.5804387Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T13:34:05.5804460Z collecting ... collected 3 items / 3 deselected / 0 selected
2025-12-04T13:34:05.5804514Z stepcurrent: skipping 3 already run items.
2025-12-04T13:34:05.5804557Z Running 0 items in this shard
2025-12-04T13:34:05.5804559Z
2025-12-04T13:34:05.5804793Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_apply/distributed.fsdp.test_fsdp_apply-4d5627d079bf00b5.xml -
2025-12-04T13:34:05.5804852Z ============================ 3 deselected in 0.00s =============================
2025-12-04T13:34:05.5805283Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_apply_in_summon_raises_error_cuda', 'test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_nested_module_apply_cuda', 'test/distributed/fsdp/test_fsdp_apply.py::TestApplyCUDA::test_transformer_module_apply_cuda']
2025-12-04T13:34:05.5805286Z
2025-12-04T13:34:05.5805472Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_apply 1/1 (test/test-reports/distributed.fsdp.test_fsdp_apply_1.1_f5676752440bc7db_.log)
2025-12-04T13:34:05.5805476Z
2025-12-04T13:34:05.5805600Z Finished distributed/fsdp/test_fsdp_apply 1/1 ... [2025-12-04 13:34:05.540975][2240356.030950816], took 1.24min
2025-12-04T13:34:05.5805859Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:34:05.5805945Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:34:05.5806039Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading
2025-12-04T13:34:05.5806088Z Uploading artifacts took 0.00 seconds
2025-12-04T13:34:05.5806142Z distributed/fsdp/test_fsdp_apply 1/1 failed!
2025-12-04T13:34:05.5806280Z Running distributed/_composable/fsdp/test_fully_shard_frozen 1/1 ...
[2025-12-04 13:34:05.544521][2240356.034502255] 2025-12-04T13:34:05.5806327Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:34:05.5806671Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_frozen.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:34:05.544719] 2025-12-04T13:34:48.6825694Z 2025-12-04T13:34:48.6826558Z distributed/_composable/fsdp/test_fully_shard_frozen 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_frozen_1.1_aa874de385d7648b_.log 2025-12-04T13:34:48.6828102Z Running 3 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_frozen.py::TestFullyShardFrozen::test_multi_forward_mixed_requires_grad, test/distributed/_composable/fsdp/test_fully_shard_frozen.py::TestFullyShardFrozen::test_train_mixed_requires_grad_across_groups, test/distributed/_composable/fsdp/test_fully_shard_frozen.py::TestFullyShardFrozen::test_train_mixed_requires_grad_per_group 2025-12-04T13:34:48.6829725Z 2025-12-04T13:34:48.6829982Z Finished distributed/_composable/fsdp/test_fully_shard_frozen 1/1 ... [2025-12-04 13:34:48.682255][2240399.172230252], took 0.72min 2025-12-04T13:34:48.6843373Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:34:48.6860294Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:34:48.6862894Z Running distributed/checkpoint/test_hsdp_checkpoint 1/1 ... [2025-12-04 13:34:48.686164][2240399.176145166] 2025-12-04T13:34:48.6863200Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:34:48.6865035Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_hsdp_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:34:48.686376] 2025-12-04T13:35:21.3073419Z 2025-12-04T13:35:21.3074141Z distributed/checkpoint/test_hsdp_checkpoint 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_hsdp_checkpoint_1.1_5a8ed1b520e6291e_.log 2025-12-04T13:35:21.3076600Z Running 4 items in this shard: test/distributed/checkpoint/test_hsdp_checkpoint.py::TestHSDPCheckpoint::test_hsdp_checkpoint_is_even_sharded_model_False, test/distributed/checkpoint/test_hsdp_checkpoint.py::TestHSDPCheckpoint::test_hsdp_checkpoint_is_even_sharded_model_True, test/distributed/checkpoint/test_hsdp_checkpoint.py::TestHSDPCheckpoint::test_hsdp_fsdp_checkpoint_conversion_is_even_sharded_model_False, test/distributed/checkpoint/test_hsdp_checkpoint.py::TestHSDPCheckpoint::test_hsdp_fsdp_checkpoint_conversion_is_even_sharded_model_True 2025-12-04T13:35:21.3078411Z 2025-12-04T13:35:21.3078758Z Finished distributed/checkpoint/test_hsdp_checkpoint 1/1 ... 
[2025-12-04 13:35:21.307186][2240431.797162399], took 0.54min 2025-12-04T13:35:21.3092098Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:35:21.3107948Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:35:21.3110306Z Running distributed/tensor/parallel/test_parallelize_api 1/1 ... [2025-12-04 13:35:21.310901][2240431.800882396] 2025-12-04T13:35:21.3110712Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:35:21.3112706Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/parallel/test_parallelize_api.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:35:21.311129] 2025-12-04T13:37:25.1809575Z 2025-12-04T13:37:25.1810543Z distributed/tensor/parallel/test_parallelize_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.parallel.test_parallelize_api_1.1_95d73ce6ba81cd9a_.log 2025-12-04T13:37:25.1821042Z Running 32 items in this shard: test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_empty_plan, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_linear_col_wise_parallel, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_linear_row_wise_parallel, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_mlp_with_module_api, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_mlp_with_module_api_nested, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_module_multi_wildcard, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_module_src_data_rank, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_module_with_digit, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_module_with_no_match, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_module_with_question, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_module_with_root_module, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_module_with_star, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_prepare_module_input, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_prepare_module_input_output, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_prepare_module_output, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_under_devicemesh_context, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_empty_plan, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_linear_col_wise_parallel, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_linear_row_wise_parallel, 
test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_parallelize_mlp_with_module_api, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_parallelize_mlp_with_module_api_nested, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_parallelize_module_multi_wildcard, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_parallelize_module_src_data_rank, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_parallelize_module_with_digit, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_parallelize_module_with_no_match, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_parallelize_module_with_question, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_parallelize_module_with_root_module, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_parallelize_module_with_star, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_prepare_module_input, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_prepare_module_input_output, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_prepare_module_output, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITestsWithLocalTensor::test_under_devicemesh_context 2025-12-04T13:37:25.1829042Z 2025-12-04T13:37:25.1829225Z Finished distributed/tensor/parallel/test_parallelize_api 1/1 ... [2025-12-04 13:37:25.180629][2240555.670605784], took 2.06min 2025-12-04T13:37:25.1829749Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:37:25.1840019Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:37:25.1843509Z Running distributed/fsdp/test_fsdp_state_dict 1/2 ... [2025-12-04 13:37:25.184135][2240555.674115874] 2025-12-04T13:37:25.1844012Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:37:25.1844890Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_state_dict.py', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:37:25.184333] 2025-12-04T13:44:01.5298187Z 2025-12-04T13:44:01.5298542Z distributed/fsdp/test_fsdp_state_dict 1/2 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_state_dict_1.2_30ee8a653eb56a56_.log 2025-12-04T13:44:01.5319733Z Running 78 items in this shard: test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_False, 
test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_True, 
test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_keys_state_dict_type_local_state_dict, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_keys_state_dict_type_state_dict, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_both_after_wrap_rank0_only_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_dest_rank0_only_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_source_after_wrap_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_source_rank0_only_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_both_after_wrap_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_both_after_wrap_rank0_only_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_both_rank0_only_and_offload_False, 
test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_both_rank0_only_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_dest_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_dest_rank0_only_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_source_after_wrap_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_source_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_full_state_dict_missing_unexpected_keys_cleaned, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_False_state_dict_rank0_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_False_state_dict_rank0_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_False_state_dict_rank0_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_True_state_dict_rank0_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_True_state_dict_rank0_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_True_state_dict_rank0_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_False_fsdp_root_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_True_fsdp_root_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_False_fsdp_root_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_False_fsdp_root_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_True_fsdp_root_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_rank0_offload_save_load_flow_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_save_load_flow_state_dict_type_sharded_state_dict, 
test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_type, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_False_ignore_inner_False_mixed_precision_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_False_ignore_inner_True_mixed_precision_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_True_ignore_inner_False_mixed_precision_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_True_ignore_inner_False_mixed_precision_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_True_ignore_inner_True_mixed_precision_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_False_ignore_inner_False_mixed_precision_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_False_ignore_inner_True_mixed_precision_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_True_ignore_inner_True_mixed_precision_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_shared_parameters_state_dict_type_state_dict, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_torch_save_load, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict4GPUs::test_local_state_dict_reshard 2025-12-04T13:44:01.5339418Z 2025-12-04T13:44:01.5339555Z Finished distributed/fsdp/test_fsdp_state_dict 1/2 ... [2025-12-04 13:44:01.530272][2240952.020248419], took 6.61min 2025-12-04T13:44:01.5339993Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:44:01.5340435Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:44:01.5340690Z Running distributed/_composable/fsdp/test_fully_shard_init 1/1 ... [2025-12-04 13:44:01.533547][2240952.023528533] 2025-12-04T13:44:01.5340906Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:44:01.5341403Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_init.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:44:01.533748] 2025-12-04T13:44:15.0699492Z 2025-12-04T13:44:15.0701044Z distributed/_composable/fsdp/test_fully_shard_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_init_1.1_d970b1e8e9e5cb1c_.log 2025-12-04T13:44:15.0718097Z Running 42 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceTensor::test_move_states_to_device_ignored_param_device, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceTensor::test_move_states_to_device_tensor, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceDTensor::test_move_states_to_device_dtensor_invalid, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceDTensor::test_move_states_to_device_dtensor_valid, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMeshArg::test_2d_mesh_without_mesh_dim_names, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMeshArg::test_invalid_mesh_ndim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_duplicate, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_list_of_mlps, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_nested, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_nested_fully_shard_and_replicate, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_single, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_states_list_of_mlps, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_states_nested_fully_shard, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_states_shared_params_and_buffers, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardParamModuleInfos::test_get_param_module_infos_duplicates, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardParamModuleInfos::test_get_param_module_infos_list_of_mlps, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardParamModuleInfos::test_get_param_module_infos_shared_params, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterTensor::test_raise_noncontiguous_parameter, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterTensor::test_raise_scalar_parameter, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterTensor::test_shard_tensor_parameters, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterDTensor::test_shard_dtensor_parameters, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_double_lazy_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_is_root, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_module_and_param_fqns, 
test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_multi_module_root, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_reset_sharded_param_in_lazy_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_invalid_meta_device_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_meta_device_1d_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_meta_device_2d_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_rank0_broadcast_meta_device_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardProcessGroupInit::test_1d_process_group_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardProcessGroupInit::test_2d_process_group_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardHSDPBroadcast::test_hsdp_broadcast_across_replicas, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestHSDPWithCustomHook::test_custom_hook_custom_stream, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestHSDPWithCustomHook::test_custom_hsdp_all_reduce_hook, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_1d_transformer_shard_dim_neg1, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_1d_transformer_shard_largest_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_1d_uneven_shard_largest_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_2d_transformer_shard_diff_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_invalid_shard_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardOldImport::test_old_import_training, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMixedDtypeParam::test_mixed_dtypes_no_grad_param 2025-12-04T13:44:15.0727689Z 2025-12-04T13:44:15.0727881Z Finished distributed/_composable/fsdp/test_fully_shard_init 1/1 ... [2025-12-04 13:44:15.069720][2240965.559695783], took 0.23min 2025-12-04T13:44:15.0728444Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:44:15.0734457Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:44:15.0736287Z Running distributed/fsdp/test_fsdp_flatten_params 1/1 ... [2025-12-04 13:44:15.073555][2240965.563536039] 2025-12-04T13:44:15.0736507Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:44:15.0738707Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_flatten_params.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:44:15.073768] 2025-12-04T13:45:18.6442412Z 2025-12-04T13:45:18.6443562Z distributed/fsdp/test_fsdp_flatten_params 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_flatten_params_1.1_a6f022fffc08237d_.log 2025-12-04T13:45:18.6449313Z Running 14 items in this shard: test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_empty_module, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_aligned_full_precision, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_aligned_mixed_precision, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_unaligned, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_with_memory_format_memory_format0, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_with_memory_format_memory_format1, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flatten_nothing, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_numel_with_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_numel_without_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_output_with_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_output_without_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_partial_flattening, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_pnorm_after_step_with_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_writeback_orig_params_no_shard 2025-12-04T13:45:18.6454819Z 2025-12-04T13:45:18.6455126Z Finished distributed/fsdp/test_fsdp_flatten_params 1/1 ... [2025-12-04 13:45:18.644011][2241029.133985402], took 1.06min 2025-12-04T13:45:18.6462800Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T13:45:18.6479501Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:45:18.6481789Z Running distributed/test_distributed_spawn 3/7 ... [2025-12-04 13:45:18.648047][2241029.138028575] 2025-12-04T13:45:18.6482792Z MPI not available -- MPI backend tests will be skipped 2025-12-04T13:45:18.6483907Z Running distributed tests for the test backend with env init_method 2025-12-04T13:45:18.6485010Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:45:18.6486751Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:45:18.648565] 2025-12-04T13:45:20.5237694Z 2025-12-04T13:45:20.5238525Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_988b5fad1c888ee9_.log 2025-12-04T13:45:20.5239214Z Running 0 items in this shard: 2025-12-04T13:45:20.5239362Z 2025-12-04T13:45:20.5244883Z Running distributed tests for the test backend with file init_method 2025-12-04T13:45:20.5246122Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:45:20.5248726Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:45:20.524729] 2025-12-04T13:45:22.4065329Z 2025-12-04T13:45:22.4066159Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_893b7217bf9d203c_.log 2025-12-04T13:45:22.4066751Z Running 0 items in this shard: 2025-12-04T13:45:22.4067091Z 2025-12-04T13:45:22.4072292Z Running distributed tests for the nccl backend with env init_method 2025-12-04T13:45:22.4082566Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:45:22.4083298Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:45:22.407494] 2025-12-04T13:48:29.8214292Z 2025-12-04T13:48:29.8215344Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_80f7553986dff483_.log 2025-12-04T13:48:29.8229096Z Running 36 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T13:48:29.8238221Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last 2025-12-04T13:48:29.8238758Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream 2025-12-04T13:48:29.8239229Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex 2025-12-04T13:48:29.8239765Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported 2025-12-04T13:48:29.8240290Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async 2025-12-04T13:48:29.8240727Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex 2025-12-04T13:48:29.8241185Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda 2025-12-04T13:48:29.8241724Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex 2025-12-04T13:48:29.8242226Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group 2025-12-04T13:48:29.8242721Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl 2025-12-04T13:48:29.8243177Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group 2025-12-04T13:48:29.8243611Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list 2025-12-04T13:48:29.8244046Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async 2025-12-04T13:48:29.8244558Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger 2025-12-04T13:48:29.8245097Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future 2025-12-04T13:48:29.8245555Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph 2025-12-04T13:48:29.8245988Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD 2025-12-04T13:48:29.8246424Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference 2025-12-04T13:48:29.8246878Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks 2025-12-04T13:48:29.8247353Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler 2025-12-04T13:48:29.8247803Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features 2025-12-04T13:48:29.8248168Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend 2025-12-04T13:48:29.8248499Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler 2025-12-04T13:48:29.8248865Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup 2025-12-04T13:48:29.8249238Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather 2025-12-04T13:48:29.8249603Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream 2025-12-04T13:48:29.8249953Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups 2025-12-04T13:48:29.8250363Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module 2025-12-04T13:48:29.8250779Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity 
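Annotation: the --shard-id / --num-shards flags in the Executing [...] commands above split one test file across shards (here shard 3 of 7), and --subprocess then launches each selected test in its own process, which is why every test gets its own "Running 1 items in this shard" record. The sketch below is an illustration only of a 1-based round-robin split; the real scheduler in PyTorch's run_test.py also balances shards by historical test timings:

def shard(tests: list[str], shard_id: int, num_shards: int) -> list[str]:
    # 1-based shard ids, matching --shard-id=3 --num-shards=7 above
    assert 1 <= shard_id <= num_shards
    return [t for i, t in enumerate(sorted(tests)) if i % num_shards == shard_id - 1]

print(shard([f"test_{i}" for i in range(10)], shard_id=3, num_shards=7))
# -> ['test_2', 'test_9']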
2025-12-04T13:48:29.8251142Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min 2025-12-04T13:48:29.8251490Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product 2025-12-04T13:48:29.8251819Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min 2025-12-04T13:48:29.8252187Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda 2025-12-04T13:48:29.8252552Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler 2025-12-04T13:48:29.8252900Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl 2025-12-04T13:48:29.8253257Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T13:48:29.8253468Z 2025-12-04T13:48:29.8253557Z Running distributed tests for the nccl backend with file init_method 2025-12-04T13:48:29.8253728Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:48:29.8254155Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:48:29.822623] 2025-12-04T13:51:37.9601746Z 2025-12-04T13:51:37.9602668Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_04076e5febbcd39d_.log 2025-12-04T13:51:37.9614056Z Running 36 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger, 
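Each shard above is launched as an ordinary Python command, so a run can be reproduced outside CI. A minimal sketch, assuming a local PyTorch checkout with torch installed; the argument list mirrors the Executing lines above, and plain "python" stands in for the CI interpreter /opt/conda/envs/py_3.10/bin/python:

    # Re-run the same shard invocation shown above (shard 3 of 7).
    # Assumes the working directory is the checkout root; the harness
    # resolves test paths relative to its test/ directory.
    import subprocess

    cmd = [
        "python", "-bb", "distributed/test_distributed_spawn.py",
        "--shard-id=3", "--num-shards=7",
        "-v", "--subprocess", "-vv", "-rfEX",
        "-p", "no:xdist", "--use-pytest", "-x", "--reruns=0",
        "--import-slow-tests", "--import-disabled-tests",
    ]
    subprocess.run(cmd, cwd="test", check=True)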
2025-12-04T13:51:37.9637967Z Running distributed tests for the gloo backend with env init_method
2025-12-04T13:51:37.9638138Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:51:37.9638576Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:51:37.961387]
2025-12-04T13:54:16.2180454Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_3dea77d4b30894d7_.log
2025-12-04T13:54:16.2215169Z Running distributed tests for the gloo backend with file init_method
2025-12-04T13:54:16.2215337Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:54:16.2215765Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:54:16.218929]
2025-12-04T13:56:54.7847818Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_c3f35fc8076e8022_.log
2025-12-04T13:56:54.7873078Z Finished distributed/test_distributed_spawn 3/7 ... [2025-12-04 13:56:54.785175][2241725.275153103], took 11.60min
2025-12-04T13:56:54.7873506Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T13:56:54.7883567Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:56:54.7883790Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading
2025-12-04T13:56:54.7883967Z Uploading artifacts took 0.00 seconds
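Since the runner has no AWS credentials, the JSON upload is skipped, but the JUnit-style XML named in the Parsing testcases line can still be inspected locally. A minimal sketch using only the Python standard library; the path mirrors the one in the log, made relative to a local checkout:

    # Summarize pass/fail/skip status from a pytest XML report.
    # Purely local; no credentials or uploads involved.
    import xml.etree.ElementTree as ET

    path = ("test/test-reports/python-pytest/distributed.test_dynamo_distributed/"
            "distributed.test_dynamo_distributed-40da44670c5524ca.xml")
    root = ET.parse(path).getroot()
    for case in root.iter("testcase"):  # JUnit XML nests <testcase> under <testsuite>
        bad = case.findall("failure") + case.findall("error")
        status = "FAIL" if bad else ("SKIP" if case.find("skipped") is not None else "PASS")
        print(status, case.get("classname"), case.get("name"))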
2025-12-04T13:56:54.7886117Z Running distributed/test_distributed_spawn 6/7 ... [2025-12-04 13:56:54.788535][2241725.278516326]
2025-12-04T13:56:54.7886919Z MPI not available -- MPI backend tests will be skipped
2025-12-04T13:56:54.7887907Z Running distributed tests for the test backend with env init_method
2025-12-04T13:56:54.7888901Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:56:54.7890630Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:56:54.788960]
2025-12-04T13:56:56.7461217Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_0aeaf24c50f81dc7_.log
2025-12-04T13:56:56.7461910Z Running 0 items in this shard:
2025-12-04T13:56:56.7463663Z Running distributed tests for the test backend with file init_method
2025-12-04T13:56:56.7464086Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:56:56.7466581Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:56:56.746506]
2025-12-04T13:56:58.6039977Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_e589a5a4ad2dbac8_.log
2025-12-04T13:56:58.6040813Z Running 0 items in this shard:
2025-12-04T13:56:58.6043948Z Running distributed tests for the nccl backend with env init_method
2025-12-04T13:56:58.6044545Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:56:58.6047612Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:56:58.604608]
2025-12-04T14:01:03.6577172Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_f1f15660ae1d40da_.log
2025-12-04T14:01:03.6590556Z Running 43 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward
2025-12-04T14:01:03.6617131Z Running distributed tests for the nccl backend with file init_method
2025-12-04T14:01:03.6617298Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T14:01:03.6617725Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:01:03.658809]
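The recurring SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set notice, like the artifact-upload message earlier, points to a guard pattern: optional telemetry and upload steps run only when their environment variables are present. A hedged sketch of that pattern; the variable names are the ones printed in the log, while the helper itself is hypothetical:

    # Hypothetical guard in the spirit of the messages above: skip
    # optional reporting when required environment variables are absent.
    import os

    def have_env(*names):
        return all(os.environ.get(n) for n in names)

    if not have_env("SCRIBE_GRAPHQL_ACCESS_TOKEN"):
        print("SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set")
    if not have_env("GITHUB_RUN_ID", "GITHUB_RUN_ATTEMPT", "ARTIFACTS_FILE_SUFFIX"):
        print("GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading")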
[2025-12-04 14:01:03.658809] 2025-12-04T14:05:08.7501203Z 2025-12-04T14:05:08.7501844Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_60990c1b215f9b56_.log 2025-12-04T14:05:08.7509143Z Running 43 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward
2025-12-04T14:05:08.7532332Z 2025-12-04T14:05:08.7532419Z Running distributed tests for the gloo backend with env init_method 2025-12-04T14:05:08.7532591Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:05:08.7533018Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
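The `--shard-id=6`/`--num-shards=7` flags in the command above split one test file's cases across seven parallel shards, of which this job runs the sixth. As a rough sketch of the flag semantics only (PyTorch's actual run_test.py policy is more involved and, where data is available, balances shards by recorded test durations; `select_shard` below is a hypothetical helper, not the real implementation):

    # Minimal round-robin illustration of --shard-id/--num-shards test sharding.
    # NOT PyTorch's actual policy; assumes 1-based shard IDs, matching the
    # `--shard-id=6 --num-shards=7` invocation above.
    def select_shard(tests: list[str], shard_id: int, num_shards: int) -> list[str]:
        assert 1 <= shard_id <= num_shards
        return [t for i, t in enumerate(sorted(tests)) if i % num_shards == shard_id - 1]

    # Toy usage: shard 6 of 7 over ten dummy test names.
    tests = [f"test_{i}" for i in range(10)]
    print(select_shard(tests, shard_id=6, num_shards=7))  # -> ['test_5']
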
[2025-12-04 14:05:08.751502] 2025-12-04T14:08:43.9486959Z 2025-12-04T14:08:43.9488263Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_de287bf016b2819d_.log 2025-12-04T14:08:43.9503682Z Running 43 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward
2025-12-04T14:08:43.9531115Z 2025-12-04T14:08:43.9531209Z Running distributed tests for the gloo backend with file init_method 2025-12-04T14:08:43.9531392Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:08:43.9531830Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 14:08:43.950060] 2025-12-04T14:12:19.3965838Z 2025-12-04T14:12:19.3966650Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_2d8c29974ac5b68c_.log 2025-12-04T14:12:19.3976507Z Running 43 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward
2025-12-04T14:12:19.4001122Z 2025-12-04T14:12:19.4001257Z Finished distributed/test_distributed_spawn 6/7 ... [2025-12-04 14:12:19.397427][2242649.88740414], took 15.41min 2025-12-04T14:12:19.4001702Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:12:19.4008652Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:12:19.4011157Z Running distributed/test_serialization 1/1 ... [2025-12-04 14:12:19.401012][2242649.890993541] 2025-12-04T14:12:19.4011360Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:12:19.4013315Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_serialization.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 14:12:19.401218] 2025-12-04T14:12:21.8700897Z 2025-12-04T14:12:21.8701736Z distributed/test_serialization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_serialization_1.1_83081683e64d7056_.log 2025-12-04T14:12:21.8705361Z Running 11 items in this shard: test/distributed/test_serialization.py::TestSerialization::test_cuda, test/distributed/test_serialization.py::TestSerialization::test_dtensor, test/distributed/test_serialization.py::TestSerialization::test_empty_tensor, test/distributed/test_serialization.py::TestSerialization::test_nested_tensors, test/distributed/test_serialization.py::TestSerialization::test_python_object, test/distributed/test_serialization.py::TestSerialization::test_scalar_tensor, test/distributed/test_serialization.py::TestSerialization::test_str_utf8, test/distributed/test_serialization.py::TestSerialization::test_strided_tensor, test/distributed/test_serialization.py::TestSerialization::test_tensor_with_offset, test/distributed/test_serialization.py::TestSerialization::test_various_data_types, test/distributed/test_serialization.py::TestSerialization::test_weights_only 2025-12-04T14:12:21.8708106Z 2025-12-04T14:12:21.8708385Z Finished distributed/test_serialization 1/1 ... [2025-12-04 14:12:21.869682][2242652.35965825], took 0.04min 2025-12-04T14:12:21.8724002Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:12:21.8739352Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:12:21.8741551Z Running distributed/fsdp/test_fsdp_multiple_wrapping 1/1 ... [2025-12-04 14:12:21.873979][2242652.36396014] 2025-12-04T14:12:21.8741909Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:12:21.8743071Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_multiple_wrapping.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:12:21.874185] 2025-12-04T14:12:53.8893592Z 2025-12-04T14:12:53.8894849Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_multiple_wrapping 1/1 (test/test-reports/distributed.fsdp.test_fsdp_multiple_wrapping_1.1_f0f6f63a5540ee38_.log) 2025-12-04T14:12:53.8896302Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-7fa2c45225f915fb.xml 2025-12-04T14:12:53.8897362Z ============================= test session starts ============================== 2025-12-04T14:12:53.8897983Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:12:53.8898525Z cachedir: .pytest_cache 2025-12-04T14:12:53.8899143Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:12:53.8899809Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:12:53.8900131Z configfile: pytest.ini 2025-12-04T14:12:53.8901009Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:12:53.8901374Z collecting ... 
collected 1 item 2025-12-04T14:12:53.8901520Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T14:12:53.8901851Z Running 1 items in this shard: test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda 2025-12-04T14:12:53.8902072Z 2025-12-04T14:12:53.8902390Z distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda I1204 14:12:23.645000 348146 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 348215 2025-12-04T14:12:53.8903500Z I1204 14:12:23.645000 348146 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 348216 2025-12-04T14:12:53.8903848Z I1204 14:12:23.646000 348146 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 348217 2025-12-04T14:12:53.8904187Z I1204 14:12:23.646000 348146 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 348218 2025-12-04T14:12:53.8905018Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T14:12:53.8905659Z device_from_device_id = _get_device_from_device_id( 2025-12-04T14:12:53.8906244Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T14:12:53.8906822Z device_from_device_id = _get_device_from_device_id( 2025-12-04T14:12:53.8907411Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T14:12:53.8907998Z device_from_device_id = _get_device_from_device_id( 2025-12-04T14:12:53.8908575Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T14:12:53.8909153Z device_from_device_id = _get_device_from_device_id( 2025-12-04T14:12:53.8909394Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:12:53.8909738Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:12:53.8910289Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.8910775Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:12:53.8911253Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.8911700Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:12:53.8912151Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8912624Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.8913151Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8913611Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.8914113Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.8914566Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:12:53.8915021Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:12:53.8915487Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:12:53.8916138Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3296722944. 
2025-12-04T14:12:53.8916739Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.8917090Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:12:53.8917683Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T14:12:53.8918186Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.8918554Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:12:53.8918969Z [rank2]:E1204 14:12:29.906000 348217 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:12:53.8919212Z dist init r=2, world=4 2025-12-04T14:12:53.8919416Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:12:53.8919752Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:12:53.8920282Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.8920761Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:12:53.8921242Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.8921690Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:12:53.8922172Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8922635Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.8923095Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8923587Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.8924047Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.8924495Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T14:12:53.8924948Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:12:53.8925409Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:12:53.8926046Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 2025-12-04T14:12:53.8926644Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.8926992Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:12:53.8927578Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T14:12:53.8928082Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.8928447Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:12:53.8928863Z [rank3]:E1204 14:12:29.912000 348218 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:12:53.8929209Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:12:53.8929547Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:12:53.8930031Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.8930562Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:12:53.8931041Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.8931526Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:12:53.8931964Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8932423Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.8932934Z [rank0]:E1204 14:12:29.913000 348215 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8933395Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.8933859Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.8934305Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:12:53.8934757Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:12:53.8935220Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:12:53.8935859Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3456106496. 2025-12-04T14:12:53.8936458Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.8936804Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:12:53.8937391Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T14:12:53.8937888Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.8938258Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:12:53.8938668Z [rank0]:E1204 14:12:29.913000 348215 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:12:53.8938911Z dist init r=3, world=4 2025-12-04T14:12:53.8939013Z dist init r=0, world=4 2025-12-04T14:12:53.8939210Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:12:53.8939548Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:12:53.8940029Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.8940576Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:12:53.8941052Z 
[rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.8941500Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:12:53.8941974Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8942434Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.8942897Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8943357Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.8943819Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.8944267Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:12:53.8944717Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:12:53.8945180Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:12:53.8945848Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160. 
2025-12-04T14:12:53.8946457Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.8946804Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:12:53.8947386Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T14:12:53.8947887Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.8948249Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:12:53.8948662Z [rank1]:E1204 14:12:29.974000 348216 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:12:53.8948906Z dist init r=1, world=4 2025-12-04T14:12:53.8949008Z FAILED [7.4162s] [100%] 2025-12-04T14:12:53.8949073Z 2025-12-04T14:12:53.8949131Z =================================== FAILURES =================================== 2025-12-04T14:12:53.8949325Z _____________ TestMultipleWrappingCUDA.test_multiple_wrapping_cuda _____________ 2025-12-04T14:12:53.8949546Z Traceback (most recent call last): 2025-12-04T14:12:53.8949791Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:12:53.8950034Z self._join_processes(fn) 2025-12-04T14:12:53.8950329Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:12:53.8950594Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:12:53.8950859Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:12:53.8951150Z raise RuntimeError(error) 2025-12-04T14:12:53.8951302Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:12:53.8951467Z Traceback (most recent call last): 2025-12-04T14:12:53.8951704Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.8951950Z getattr(self, test_name)() 2025-12-04T14:12:53.8952181Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.8952415Z fn() 2025-12-04T14:12:53.8952616Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8952848Z method(*args, **kwargs) 2025-12-04T14:12:53.8953069Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8953301Z method(*args, **kwargs) 2025-12-04T14:12:53.8953517Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.8953743Z with policy(): 2025-12-04T14:12:53.8953952Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T14:12:53.8954188Z raise RuntimeError(msg) 2025-12-04T14:12:53.8954586Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3456106496. 2025-12-04T14:12:53.8954947Z 2025-12-04T14:12:53.8955024Z To execute this test, run the following from the base repo dir: 2025-12-04T14:12:53.8955363Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T14:12:53.8955625Z 2025-12-04T14:12:53.8955713Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:12:53.8955840Z 2025-12-04T14:12:53.8955897Z Process 2 exited with error code 10 and exception: 2025-12-04T14:12:53.8956041Z Traceback (most recent call last): 2025-12-04T14:12:53.8956278Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.8956520Z getattr(self, test_name)() 2025-12-04T14:12:53.8956749Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.8956982Z fn() 2025-12-04T14:12:53.8957184Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8957411Z method(*args, **kwargs) 2025-12-04T14:12:53.8957632Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8957858Z method(*args, **kwargs) 2025-12-04T14:12:53.8958075Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.8958343Z with policy(): 2025-12-04T14:12:53.8958555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:12:53.8958787Z raise RuntimeError(msg) 2025-12-04T14:12:53.8959183Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3296722944. 
2025-12-04T14:12:53.8959547Z 2025-12-04T14:12:53.8959624Z To execute this test, run the following from the base repo dir: 2025-12-04T14:12:53.8959985Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T14:12:53.8960286Z 2025-12-04T14:12:53.8960376Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:12:53.8960506Z 2025-12-04T14:12:53.8960563Z Process 3 exited with error code 10 and exception: 2025-12-04T14:12:53.8960704Z Traceback (most recent call last): 2025-12-04T14:12:53.8960944Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.8961185Z getattr(self, test_name)() 2025-12-04T14:12:53.8961414Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.8961647Z fn() 2025-12-04T14:12:53.8961848Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8962075Z method(*args, **kwargs) 2025-12-04T14:12:53.8962292Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8962517Z method(*args, **kwargs) 2025-12-04T14:12:53.8962735Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.8963062Z with policy(): 2025-12-04T14:12:53.8963270Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:12:53.8963498Z raise RuntimeError(msg) 2025-12-04T14:12:53.8963890Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 2025-12-04T14:12:53.8964254Z 2025-12-04T14:12:53.8964330Z To execute this test, run the following from the base repo dir: 2025-12-04T14:12:53.8964664Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T14:12:53.8964924Z 2025-12-04T14:12:53.8965014Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:12:53.8965136Z 2025-12-04T14:12:53.8965141Z 2025-12-04T14:12:53.8965220Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:12:53.8965425Z Process 0 terminated with exit code 10, terminating remaining processes. 
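The RuntimeError repeated above comes from a per-test CUDA memory-leak guard: it snapshots each device's memory counters before the test body and re-checks them afterwards, which is where the two quoted figures (caching-allocator bytes and driver-allocated bytes) come from. Below is a minimal sketch of that pattern using standard torch.cuda counter APIs; it is not the actual leak-check code in torch/testing/_internal/common_utils.py, and `run_with_leak_check` is a hypothetical helper:

    # Simplified illustration of a CUDA memory-leak check -- NOT PyTorch's
    # actual implementation. It compares the caching-allocator and driver-level
    # counters before and after the test, mirroring the numbers in the error.
    import torch

    def run_with_leak_check(test_fn, device: int = 0) -> None:
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes
        free, total = torch.cuda.mem_get_info(device)
        driver_before = total - free                         # driver-allocated bytes

        test_fn()

        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()                             # release cached blocks first
        alloc_after = torch.cuda.memory_allocated(device)
        free, total = torch.cuda.mem_get_info(device)
        driver_after = total - free

        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"CUDA memory leak on device {device}: caching allocator went "
                f"{alloc_before} -> {alloc_after} bytes, driver went "
                f"{driver_before} -> {driver_after} bytes"
            )

Requiring both counters to grow, as the wording "CUDA driver API confirmed a leak" suggests the real check does, avoids flagging growth that the caching allocator alone could explain.
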
2025-12-04T14:12:53.8965820Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-7fa2c45225f915fb.xml - 2025-12-04T14:12:53.8966192Z =========================== short test summary info ============================ 2025-12-04T14:12:53.8966538Z FAILED [7.4162s] distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:12:53.8966862Z Traceback (most recent call last): 2025-12-04T14:12:53.8967155Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.8967398Z getattr(self, test_name)() 2025-12-04T14:12:53.8967630Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.8967860Z fn() 2025-12-04T14:12:53.8968058Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8968288Z method(*args, **kwargs) 2025-12-04T14:12:53.8968538Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8968768Z method(*args, **kwargs) 2025-12-04T14:12:53.8968985Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.8969215Z with policy(): 2025-12-04T14:12:53.8969423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:12:53.8969655Z raise RuntimeError(msg) 2025-12-04T14:12:53.8970052Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3456106496. 
2025-12-04T14:12:53.8970454Z 2025-12-04T14:12:53.8970527Z To execute this test, run the following from the base repo dir: 2025-12-04T14:12:53.8970870Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T14:12:53.8971131Z 2025-12-04T14:12:53.8971217Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:12:53.8971350Z 2025-12-04T14:12:53.8971408Z Process 2 exited with error code 10 and exception: 2025-12-04T14:12:53.8971552Z Traceback (most recent call last): 2025-12-04T14:12:53.8971790Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.8972030Z getattr(self, test_name)() 2025-12-04T14:12:53.8972260Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.8972493Z fn() 2025-12-04T14:12:53.8972691Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8972921Z method(*args, **kwargs) 2025-12-04T14:12:53.8973137Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8973364Z method(*args, **kwargs) 2025-12-04T14:12:53.8973582Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.8973806Z with policy(): 2025-12-04T14:12:53.8974014Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:12:53.8974247Z raise RuntimeError(msg) 2025-12-04T14:12:53.8974639Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3296722944. 
2025-12-04T14:12:53.8974999Z 2025-12-04T14:12:53.8975074Z To execute this test, run the following from the base repo dir: 2025-12-04T14:12:53.8975406Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T14:12:53.8975733Z 2025-12-04T14:12:53.8975820Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:12:53.8975945Z 2025-12-04T14:12:53.8976002Z Process 3 exited with error code 10 and exception: 2025-12-04T14:12:53.8976142Z Traceback (most recent call last): 2025-12-04T14:12:53.8976382Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.8976626Z getattr(self, test_name)() 2025-12-04T14:12:53.8976856Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.8977087Z fn() 2025-12-04T14:12:53.8977322Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8977554Z method(*args, **kwargs) 2025-12-04T14:12:53.8977772Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.8978006Z method(*args, **kwargs) 2025-12-04T14:12:53.8978227Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.8978453Z with policy(): 2025-12-04T14:12:53.8978662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:12:53.8978892Z raise RuntimeError(msg) 2025-12-04T14:12:53.8979288Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 2025-12-04T14:12:53.8979656Z 2025-12-04T14:12:53.8979731Z To execute this test, run the following from the base repo dir: 2025-12-04T14:12:53.8980061Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T14:12:53.8980362Z 2025-12-04T14:12:53.8980448Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:12:53.8980637Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T14:12:53.8980801Z ============================== 1 failed in 7.43s =============================== 2025-12-04T14:12:53.8980935Z Got exit code 1 2025-12-04T14:12:53.8981035Z Retrying single test... 
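The leak report above is a straight before/after comparison: the harness snapshots the CUDA caching allocator's live bytes and the driver-reported allocation on each device before the test body runs, re-checks both afterwards, and raises when both have grown. A minimal sketch of that pattern follows; the class name is made up and the logic is illustrative, not PyTorch's internal policy() checker, which additionally retries, runs per-device, and consults the driver API before reporting.

    # Hedged sketch of a before/after CUDA memory leak check.
    # Assumes a CUDA/ROCm build of PyTorch; "LeakCheck" is a hypothetical name.
    import gc
    import torch

    class LeakCheck:
        def __init__(self, device: int = 0):
            self.device = device

        def __enter__(self):
            torch.cuda.synchronize(self.device)
            self.alloc_before = torch.cuda.memory_allocated(self.device)
            free, total = torch.cuda.mem_get_info(self.device)
            self.driver_before = total - free
            return self

        def __exit__(self, exc_type, exc, tb):
            if exc_type is not None:
                return False  # do not mask the test's own failure
            gc.collect()
            torch.cuda.synchronize(self.device)
            torch.cuda.empty_cache()
            alloc_after = torch.cuda.memory_allocated(self.device)
            free, total = torch.cuda.mem_get_info(self.device)
            driver_after = total - free
            # Flag a leak only when the caching allocator AND the driver agree,
            # mirroring the "driver API confirmed a leak" wording above.
            if alloc_after > self.alloc_before and driver_after > self.driver_before:
                raise RuntimeError(
                    f"possible leak on device {self.device}: caching allocator "
                    f"{self.alloc_before} -> {alloc_after} bytes, driver "
                    f"{self.driver_before} -> {driver_after} bytes"
                )
            return False

Wrapping a suspect block in `with LeakCheck(torch.cuda.current_device()):` reproduces the same style of report outside the test harness.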
2025-12-04T14:12:53.8981328Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-8fee7c372441c531.xml
2025-12-04T14:12:53.8981647Z ============================= test session starts ==============================
2025-12-04T14:12:53.8981861Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:12:53.8982053Z cachedir: .pytest_cache
2025-12-04T14:12:53.8982274Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:12:53.8982514Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:12:53.8982634Z configfile: pytest.ini
2025-12-04T14:12:53.8982862Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:12:53.8983106Z collecting ... collected 1 item
2025-12-04T14:12:53.8983403Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda
2025-12-04T14:12:53.8983696Z Running 1 items in this shard
2025-12-04T14:12:53.8983767Z 
2025-12-04T14:12:53.8984077Z distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda I1204 14:12:33.542000 348540 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 348609
2025-12-04T14:12:53.8984605Z I1204 14:12:33.543000 348540 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 348610
2025-12-04T14:12:53.8984949Z I1204 14:12:33.543000 348540 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 348611
2025-12-04T14:12:53.8985288Z I1204 14:12:33.544000 348540 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 348612
2025-12-04T14:12:53.8986004Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T14:12:53.8986594Z   device_from_device_id = _get_device_from_device_id(
2025-12-04T14:12:53.8987183Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T14:12:53.8987766Z   device_from_device_id = _get_device_from_device_id(
2025-12-04T14:12:53.8988351Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T14:12:53.8988928Z   device_from_device_id = _get_device_from_device_id(
2025-12-04T14:12:53.8989564Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T14:12:53.8990142Z   device_from_device_id = _get_device_from_device_id(
2025-12-04T14:12:53.8990425Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:12:53.8990767Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:12:53.8991258Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:12:53.8991742Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
2025-12-04T14:12:53.8992222Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:12:53.8992672Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
2025-12-04T14:12:53.8993114Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.8993577Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:12:53.8994079Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.8994539Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:12:53.8995006Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:12:53.8995485Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
2025-12-04T14:12:53.8995942Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:12:53.8996411Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
2025-12-04T14:12:53.8997054Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3296722944.
2025-12-04T14:12:53.8997657Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:12:53.8998004Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:12:53.8998589Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda
2025-12-04T14:12:53.8999095Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:12:53.8999459Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:12:53.8999872Z [rank2]:E1204 14:12:39.679000 348611 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:12:53.9000115Z dist init r=2, world=4
2025-12-04T14:12:53.9000358Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:12:53.9000693Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:12:53.9001181Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:12:53.9001657Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
2025-12-04T14:12:53.9002133Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:12:53.9002579Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
2025-12-04T14:12:53.9003018Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9003524Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:12:53.9003985Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9004479Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:12:53.9004942Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:12:53.9005396Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
2025-12-04T14:12:53.9005848Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:12:53.9006311Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
2025-12-04T14:12:53.9006954Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160.
2025-12-04T14:12:53.9007553Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:12:53.9007903Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:12:53.9008490Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda
2025-12-04T14:12:53.9008989Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:12:53.9009356Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:12:53.9009773Z [rank1]:E1204 14:12:39.698000 348610 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:12:53.9010019Z dist init r=1, world=4
2025-12-04T14:12:53.9010241Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:12:53.9010580Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:12:53.9011068Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:12:53.9011548Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
2025-12-04T14:12:53.9012022Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:12:53.9012505Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
2025-12-04T14:12:53.9012941Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9013403Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:12:53.9013896Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9014355Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:12:53.9014822Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:12:53.9015271Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
2025-12-04T14:12:53.9015722Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:12:53.9016187Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
2025-12-04T14:12:53.9016827Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3456106496.
2025-12-04T14:12:53.9017427Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:12:53.9017773Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:12:53.9018358Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda
2025-12-04T14:12:53.9018859Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:12:53.9019222Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:12:53.9019635Z [rank0]:E1204 14:12:39.702000 348609 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:12:53.9019878Z dist init r=0, world=4
2025-12-04T14:12:53.9020081Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:12:53.9020453Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:12:53.9020949Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:12:53.9021426Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
2025-12-04T14:12:53.9021932Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:12:53.9022380Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
2025-12-04T14:12:53.9022851Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9023314Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:12:53.9023775Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9024240Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:12:53.9024701Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:12:53.9025152Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
2025-12-04T14:12:53.9025611Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:12:53.9026075Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
2025-12-04T14:12:53.9026716Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296.
2025-12-04T14:12:53.9027324Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:12:53.9027676Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:12:53.9028258Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda
2025-12-04T14:12:53.9028762Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:12:53.9029126Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:12:53.9029539Z [rank3]:E1204 14:12:39.704000 348612 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:12:53.9029784Z dist init r=3, world=4
2025-12-04T14:12:53.9029889Z FAILED [7.2167s] [100%]
2025-12-04T14:12:53.9029957Z 
2025-12-04T14:12:53.9030017Z =================================== FAILURES ===================================
2025-12-04T14:12:53.9030255Z _____________ TestMultipleWrappingCUDA.test_multiple_wrapping_cuda _____________
2025-12-04T14:12:53.9030438Z Traceback (most recent call last):
2025-12-04T14:12:53.9030716Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:12:53.9030960Z     self._join_processes(fn)
2025-12-04T14:12:53.9031206Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:12:53.9031470Z     self._check_return_codes(fn, elapsed_time)
2025-12-04T14:12:53.9031740Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:12:53.9031998Z     raise RuntimeError(error)
2025-12-04T14:12:53.9032187Z RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T14:12:53.9032350Z Traceback (most recent call last):
2025-12-04T14:12:53.9032588Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:12:53.9032833Z     getattr(self, test_name)()
2025-12-04T14:12:53.9033063Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:12:53.9033296Z     fn()
2025-12-04T14:12:53.9033498Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9033729Z     method(*args, **kwargs)
2025-12-04T14:12:53.9033952Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9034181Z     method(*args, **kwargs)
2025-12-04T14:12:53.9034402Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:12:53.9034628Z     with policy():
2025-12-04T14:12:53.9034840Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:12:53.9035075Z     raise RuntimeError(msg)
2025-12-04T14:12:53.9035474Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3296722944.
2025-12-04T14:12:53.9035844Z 
2025-12-04T14:12:53.9035918Z To execute this test, run the following from the base repo dir:
2025-12-04T14:12:53.9036259Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda
2025-12-04T14:12:53.9036523Z 
2025-12-04T14:12:53.9036613Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:12:53.9036740Z 
2025-12-04T14:12:53.9036742Z 
2025-12-04T14:12:53.9036820Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:12:53.9037022Z Process 2 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:12:53.9037419Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-8fee7c372441c531.xml -
2025-12-04T14:12:53.9037785Z =========================== short test summary info ============================
2025-12-04T14:12:53.9038130Z FAILED [7.2167s] distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda - RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T14:12:53.9038457Z Traceback (most recent call last):
2025-12-04T14:12:53.9038705Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:12:53.9038948Z     getattr(self, test_name)()
2025-12-04T14:12:53.9039179Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:12:53.9039446Z     fn()
2025-12-04T14:12:53.9039647Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9039879Z     method(*args, **kwargs)
2025-12-04T14:12:53.9040099Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9040379Z     method(*args, **kwargs)
2025-12-04T14:12:53.9040600Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:12:53.9040827Z     with policy():
2025-12-04T14:12:53.9041074Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:12:53.9041309Z     raise RuntimeError(msg)
2025-12-04T14:12:53.9041707Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3296722944.
2025-12-04T14:12:53.9042077Z 
2025-12-04T14:12:53.9042151Z To execute this test, run the following from the base repo dir:
2025-12-04T14:12:53.9042488Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda
2025-12-04T14:12:53.9042752Z 
2025-12-04T14:12:53.9042840Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:12:53.9043032Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:12:53.9043192Z ============================== 1 failed in 7.23s ===============================
2025-12-04T14:12:53.9043330Z Got exit code 1
2025-12-04T14:12:53.9043429Z Retrying single test...
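Alongside the leak itself, every run repeats the same FSDP UserWarning: `device_id` was passed as a bare "cuda" with no index. The warning names both remedies; a minimal sketch of the two, assuming one GPU per rank and an already-initialized default process group (the model and function name are hypothetical):

    # Hedged sketch: two ways to silence the `device_id` warning seen above.
    # Assumes torch.distributed.init_process_group() has already run on this rank.
    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_for_rank(rank: int) -> FSDP:
        model = nn.Linear(8, 8)
        # Remedy 1: pin the current device first, so FSDP resolves a bare
        # "cuda" to an explicit index.
        torch.cuda.set_device(rank)
        # Remedy 2 (independent of remedy 1): pass the fully-indexed device.
        return FSDP(model, device_id=torch.device("cuda", rank))

Either change removes the warning; the leak-check failure itself is a separate issue.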
2025-12-04T14:12:53.9043718Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-198131612e61d01a.xml 2025-12-04T14:12:53.9044043Z ============================= test session starts ============================== 2025-12-04T14:12:53.9044252Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:12:53.9044441Z cachedir: .pytest_cache 2025-12-04T14:12:53.9044662Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:12:53.9044902Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:12:53.9045022Z configfile: pytest.ini 2025-12-04T14:12:53.9045251Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:12:53.9045495Z collecting ... collected 1 item 2025-12-04T14:12:53.9045785Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda 2025-12-04T14:12:53.9046082Z Running 1 items in this shard 2025-12-04T14:12:53.9046157Z 2025-12-04T14:12:53.9046460Z distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda I1204 14:12:43.273000 348934 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 349003 2025-12-04T14:12:53.9046951Z I1204 14:12:43.273000 348934 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 349004 2025-12-04T14:12:53.9047295Z I1204 14:12:43.274000 348934 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 349005 2025-12-04T14:12:53.9047634Z I1204 14:12:43.274000 348934 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 349006 2025-12-04T14:12:53.9048313Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T14:12:53.9048939Z device_from_device_id = _get_device_from_device_id( 2025-12-04T14:12:53.9049544Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T14:12:53.9050128Z device_from_device_id = _get_device_from_device_id( 2025-12-04T14:12:53.9050745Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T14:12:53.9051320Z device_from_device_id = _get_device_from_device_id( 2025-12-04T14:12:53.9051897Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T14:12:53.9052469Z device_from_device_id = _get_device_from_device_id( 2025-12-04T14:12:53.9052708Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:12:53.9053051Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:12:53.9053540Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.9054018Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:12:53.9054505Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.9054952Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:12:53.9055392Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.9055855Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.9056320Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.9056794Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.9057254Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.9057740Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:12:53.9058191Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:12:53.9058653Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:12:53.9059325Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160. 2025-12-04T14:12:53.9059936Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.9060320Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:12:53.9060913Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T14:12:53.9061418Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.9061784Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:12:53.9062203Z [rank1]:E1204 14:12:49.443000 349004 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:12:53.9062449Z dist init r=1, world=4 2025-12-04T14:12:53.9062652Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:12:53.9062985Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:12:53.9063467Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.9063947Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:12:53.9064419Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.9064865Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:12:53.9065300Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.9065756Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.9066215Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.9066674Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.9067165Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.9067616Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:12:53.9077875Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:12:53.9078391Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:12:53.9079043Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 2025-12-04T14:12:53.9079653Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.9080001Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:12:53.9080641Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T14:12:53.9081143Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.9081508Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:12:53.9081922Z [rank3]:E1204 14:12:49.448000 349006 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:12:53.9082171Z dist init r=3, world=4 2025-12-04T14:12:53.9082713Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:12:53.9083052Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:12:53.9083536Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.9084013Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:12:53.9084492Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.9084946Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:12:53.9085390Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.9085859Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.9086329Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.9086841Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.9087307Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.9087796Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:12:53.9088263Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:12:53.9088735Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:12:53.9089381Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3296722944. 
2025-12-04T14:12:53.9089989Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.9090382Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:12:53.9091117Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T14:12:53.9091628Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:12:53.9091998Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:12:53.9092417Z [rank2]:E1204 14:12:49.514000 349005 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:12:53.9092672Z dist init r=2, world=4 2025-12-04T14:12:53.9092884Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:12:53.9093231Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:12:53.9093723Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:12:53.9094207Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:12:53.9094693Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:12:53.9095147Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:12:53.9095584Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.9096092Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.9096560Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:12:53.9097028Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:12:53.9097532Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:12:53.9097989Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T14:12:53.9098450Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:12:53.9098925Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:12:53.9099574Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3456106496.
2025-12-04T14:12:53.9100214Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:12:53.9100573Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:12:53.9101162Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda
2025-12-04T14:12:53.9101670Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:12:53.9102042Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:12:53.9102466Z [rank0]:E1204 14:12:49.518000 349003 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:12:53.9102715Z dist init r=0, world=4
2025-12-04T14:12:53.9102823Z FAILED [7.5209s] [100%]
2025-12-04T14:12:53.9102895Z
2025-12-04T14:12:53.9102958Z =================================== FAILURES ===================================
2025-12-04T14:12:53.9103160Z _____________ TestMultipleWrappingCUDA.test_multiple_wrapping_cuda _____________
2025-12-04T14:12:53.9103348Z Traceback (most recent call last):
2025-12-04T14:12:53.9103600Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:12:53.9103850Z self._join_processes(fn)
2025-12-04T14:12:53.9104104Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:12:53.9104374Z self._check_return_codes(fn, elapsed_time)
2025-12-04T14:12:53.9104647Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:12:53.9104946Z raise RuntimeError(error)
2025-12-04T14:12:53.9105109Z RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T14:12:53.9105281Z Traceback (most recent call last):
2025-12-04T14:12:53.9105527Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:12:53.9105774Z getattr(self, test_name)()
2025-12-04T14:12:53.9106010Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:12:53.9106252Z fn()
2025-12-04T14:12:53.9106490Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9106731Z method(*args, **kwargs)
2025-12-04T14:12:53.9106963Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9107201Z method(*args, **kwargs)
2025-12-04T14:12:53.9107432Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:12:53.9107666Z with policy():
2025-12-04T14:12:53.9107884Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:12:53.9108121Z raise RuntimeError(msg)
2025-12-04T14:12:53.9108526Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160.
2025-12-04T14:12:53.9108897Z
2025-12-04T14:12:53.9108975Z To execute this test, run the following from the base repo dir:
2025-12-04T14:12:53.9109319Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda
2025-12-04T14:12:53.9109593Z
2025-12-04T14:12:53.9109686Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:12:53.9109816Z
2025-12-04T14:12:53.9109879Z Process 3 exited with error code 10 and exception:
2025-12-04T14:12:53.9110028Z Traceback (most recent call last):
2025-12-04T14:12:53.9110322Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:12:53.9110575Z getattr(self, test_name)()
2025-12-04T14:12:53.9110813Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:12:53.9111055Z fn()
2025-12-04T14:12:53.9111262Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9111498Z method(*args, **kwargs)
2025-12-04T14:12:53.9111725Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9111963Z method(*args, **kwargs)
2025-12-04T14:12:53.9112186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:12:53.9112416Z with policy():
2025-12-04T14:12:53.9112631Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:12:53.9112871Z raise RuntimeError(msg)
2025-12-04T14:12:53.9113280Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296.
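The RuntimeError above comes from the harness's memory-leak guard: it snapshots the CUDA caching allocator (and driver-level allocation counters) before the test body runs and compares them afterwards, failing the rank if the numbers grew. Below is a minimal sketch of that before/after pattern, assuming a working torch install (on ROCm builds torch.cuda maps to HIP); it is a simplified analog of the check in common_utils.py, not the actual implementation, and run_test_body is a hypothetical stand-in for the wrapped test:

```python
import torch

def check_for_leak(run_test_body, device=0):
    # Snapshot the caching allocator's live bytes before the test runs.
    torch.cuda.synchronize(device)
    before = torch.cuda.memory_allocated(device)
    run_test_body()
    # Settle pending work and release cached blocks, then compare.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    after = torch.cuda.memory_allocated(device)
    if after > before:
        # Mirrors the "Caching allocator allocated memory was X and is now
        # reported as Y" failure seen in the log above.
        raise RuntimeError(
            f"possible leak on device {device}: allocator reported "
            f"{before} bytes before the test and {after} after"
        )
```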
2025-12-04T14:12:53.9113646Z
2025-12-04T14:12:53.9113723Z To execute this test, run the following from the base repo dir:
2025-12-04T14:12:53.9114108Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda
2025-12-04T14:12:53.9114374Z
2025-12-04T14:12:53.9114465Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:12:53.9114595Z
2025-12-04T14:12:53.9114597Z
2025-12-04T14:12:53.9114680Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:12:53.9114895Z Process 1 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:12:53.9115330Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-198131612e61d01a.xml -
2025-12-04T14:12:53.9115710Z =========================== short test summary info ============================
2025-12-04T14:12:53.9116061Z FAILED [7.5209s] distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda - RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T14:12:53.9116397Z Traceback (most recent call last):
2025-12-04T14:12:53.9116644Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:12:53.9116887Z getattr(self, test_name)()
2025-12-04T14:12:53.9117118Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:12:53.9117353Z fn()
2025-12-04T14:12:53.9117561Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9117794Z method(*args, **kwargs)
2025-12-04T14:12:53.9118012Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9118240Z method(*args, **kwargs)
2025-12-04T14:12:53.9118460Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:12:53.9118686Z with policy():
2025-12-04T14:12:53.9118896Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:12:53.9119127Z raise RuntimeError(msg)
2025-12-04T14:12:53.9119524Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160.
2025-12-04T14:12:53.9119886Z
2025-12-04T14:12:53.9119963Z To execute this test, run the following from the base repo dir:
2025-12-04T14:12:53.9120332Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda
2025-12-04T14:12:53.9120595Z
2025-12-04T14:12:53.9120684Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:12:53.9120811Z
2025-12-04T14:12:53.9120869Z Process 3 exited with error code 10 and exception:
2025-12-04T14:12:53.9121012Z Traceback (most recent call last):
2025-12-04T14:12:53.9121252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:12:53.9121494Z getattr(self, test_name)()
2025-12-04T14:12:53.9121725Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:12:53.9121957Z fn()
2025-12-04T14:12:53.9122160Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9122389Z method(*args, **kwargs)
2025-12-04T14:12:53.9122611Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:12:53.9122879Z method(*args, **kwargs)
2025-12-04T14:12:53.9123098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:12:53.9123326Z with policy():
2025-12-04T14:12:53.9123536Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:12:53.9123767Z raise RuntimeError(msg)
2025-12-04T14:12:53.9124200Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296.
2025-12-04T14:12:53.9124564Z
2025-12-04T14:12:53.9124640Z To execute this test, run the following from the base repo dir:
2025-12-04T14:12:53.9124977Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda
2025-12-04T14:12:53.9125245Z
2025-12-04T14:12:53.9125335Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:12:53.9125528Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
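The repro instructions repeated above can be scripted directly. A minimal sketch that shells out with the same environment variables the log prints; the command and test name are verbatim from the log, and the commented-out PYTORCH_PRINT_REPRO_ON_FAILURE=0 line is the suppression knob the message itself mentions:

```python
import os
import subprocess

# Reproduce the failing test locally, from the base repo dir, with the
# env vars shown in the "To execute this test" lines above.
env = dict(os.environ)
env["PYTORCH_TEST_WITH_ROCM"] = "1"
env["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"
# Uncomment to silence the "To execute this test..." repro banner:
# env["PYTORCH_PRINT_REPRO_ON_FAILURE"] = "0"

subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_fsdp_multiple_wrapping.py",
        "TestMultipleWrappingCUDA.test_multiple_wrapping_cuda",
    ],
    env=env,
    check=True,  # raise CalledProcessError if the leak check trips again
)
```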
2025-12-04T14:12:53.9125698Z ============================== 1 failed in 7.53s ===============================
2025-12-04T14:12:53.9125836Z Got exit code 1
2025-12-04T14:12:53.9126071Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda
2025-12-04T14:12:53.9126411Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T14:12:53.9126806Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-df9a6e7ed5898a56.xml
2025-12-04T14:12:53.9127132Z ============================= test session starts ==============================
2025-12-04T14:12:53.9127348Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:12:53.9127539Z cachedir: .pytest_cache
2025-12-04T14:12:53.9127766Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:12:53.9128011Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:12:53.9128133Z configfile: pytest.ini
2025-12-04T14:12:53.9128366Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:12:53.9128638Z collecting ... collected 1 item / 1 deselected / 0 selected
2025-12-04T14:12:53.9128798Z stepcurrent: skipping 1 already run items.
2025-12-04T14:12:53.9128931Z Running 0 items in this shard
2025-12-04T14:12:53.9129003Z
2025-12-04T14:12:53.9129279Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-df9a6e7ed5898a56.xml -
2025-12-04T14:12:53.9129648Z ============================ 1 deselected in 0.00s =============================
2025-12-04T14:12:53.9129950Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda']
2025-12-04T14:12:53.9130225Z
2025-12-04T14:12:53.9130449Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_multiple_wrapping 1/1 (test/test-reports/distributed.fsdp.test_fsdp_multiple_wrapping_1.1_f0f6f63a5540ee38_.log)
2025-12-04T14:12:53.9130706Z
2025-12-04T14:12:53.9130851Z Finished distributed/fsdp/test_fsdp_multiple_wrapping 1/1 ... [2025-12-04 14:12:53.889490][2242684.379466174], took 0.53min
2025-12-04T14:12:53.9131292Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T14:12:53.9131712Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T14:12:53.9131933Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading
2025-12-04T14:12:53.9132115Z Uploading artifacts took 0.00 seconds
2025-12-04T14:12:53.9132270Z distributed/fsdp/test_fsdp_multiple_wrapping 1/1 failed!
2025-12-04T14:12:53.9132508Z Running distributed/_composable/fsdp/test_fully_shard_comm 1/1 ... [2025-12-04 14:12:53.893266][2242684.383246991]
2025-12-04T14:12:53.9132729Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T14:12:53.9133186Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_comm.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:12:53.893468]
2025-12-04T14:16:11.3805351Z
2025-12-04T14:16:11.3805731Z distributed/_composable/fsdp/test_fully_shard_comm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_comm_1.1_f5e9a826ce5275ac_.log
2025-12-04T14:16:11.3816522Z Running 22 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardCollectiveOps::test_all_gather_fp32, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardCollectiveOps::test_reduce_scatter_fp16, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardCollectiveOps::test_reduce_scatter_fp32, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardCommunication::test_fully_shard_communication_count, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardCommunication::test_manual_reshard_with_reshard_after_forward_false, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardCommunication::test_set_reduce_scatter_divide_factor, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardCommunication::test_set_reshard_after_forward, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardPrefetch::test_backward_misprefetch, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardPrefetch::test_fully_shard_backward_prefetch, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardPrefetch::test_fully_shard_multi_module_backward_prefetch, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardPrefetch::test_fully_shard_multi_module_unused_module, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardPrefetch::test_set_modules_to_backward_prefetch, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardPrefetch::test_set_modules_to_backward_prefetch_inside_ac, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardPrefetch::test_set_modules_to_forward_prefetch, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardUnshardMultiProcess::test_unshard_async, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardUnshardMultiThread::test_unshard_no_param_group, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardUnshardMultiThread::test_unshard_without_lazy_init, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardAllocFromPG::test_exception_when_used_together_with_comm_hooks, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardAllocFromPG::test_fully_shard_alloc_from_pg, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardForceSumReduction::test_fully_shard_force_sum_both_reductions, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardForceSumReduction::test_fully_shard_force_sum_reduce_scatter, test/distributed/_composable/fsdp/test_fully_shard_comm.py::TestFullyShardReduceOpWorldSize1::test_size1_reduceop
2025-12-04T14:16:11.3835877Z
2025-12-04T14:16:11.3836451Z Finished distributed/_composable/fsdp/test_fully_shard_comm 1/1 ... [2025-12-04 14:16:11.383446][2242881.873424015], took 3.29min
2025-12-04T14:16:11.3854012Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T14:16:11.3869971Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T14:16:11.3872745Z Running distributed/checkpoint/test_file_system_checkpoint 1/1 ... [2025-12-04 14:16:11.387091][2242881.877071844]
2025-12-04T14:16:11.3872992Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T14:16:11.3874210Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_file_system_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:16:11.387286]
2025-12-04T14:16:48.0104778Z
2025-12-04T14:16:48.0108636Z distributed/checkpoint/test_file_system_checkpoint 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_file_system_checkpoint_1.1_a112ef163ac7b93c_.log
2025-12-04T14:16:48.0111009Z Running 9 items in this shard: test/distributed/checkpoint/test_file_system_checkpoint.py::TestDistributedStateDictSaveLoad::test_read_write_only_tensor, test/distributed/checkpoint/test_file_system_checkpoint.py::TestDistributedStateDictSaveLoadWithSharedTensor::test_read_write_shard_tensor_extensions0, test/distributed/checkpoint/test_file_system_checkpoint.py::TestDistributedStateDictSaveLoadWithSharedTensor::test_read_write_shard_tensor_extensions1, test/distributed/checkpoint/test_file_system_checkpoint.py::TestDistributedStateDictSaveLoadWithSharedTensor::test_read_write_shard_tensor_extensions2, test/distributed/checkpoint/test_file_system_checkpoint.py::TestDistributedReshardOnLoad::test_load_rowwise_to_colwise, test/distributed/checkpoint/test_file_system_checkpoint.py::TestDistributedReshardOnLoad::test_load_with_different_shard_plan, test/distributed/checkpoint/test_file_system_checkpoint.py::TestDistributedReshardOnLoad::test_save_load_bytes, test/distributed/checkpoint/test_file_system_checkpoint.py::TestDistributedReshardOnLoad::test_switch_between_sharded_tensor_to_tensor, test/distributed/checkpoint/test_file_system_checkpoint.py::TestDistributedStateDictSaveLoadWithCaching::test_read_write_shard_tensor
2025-12-04T14:16:48.0112950Z
2025-12-04T14:16:48.0113110Z Finished distributed/checkpoint/test_file_system_checkpoint 1/1 ... [2025-12-04 14:16:48.010235][2242918.500210994], took 0.61min
2025-12-04T14:16:48.0128987Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T14:16:48.0145600Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T14:16:48.0148545Z Running distributed/test_composability 1/1 ... [2025-12-04 14:16:48.014688][2242918.504668472]
2025-12-04T14:16:48.0149894Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T14:16:48.0150791Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_composability.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:16:48.014882]
2025-12-04T14:17:01.1521866Z
2025-12-04T14:17:01.1522779Z distributed/test_composability 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_composability_1.1_d407f22a62507148_.log
2025-12-04T14:17:01.1527780Z Running 13 items in this shard: test/distributed/test_composability.py::ComposabilityTest::test_pp_ddp_ScheduleClass0, test/distributed/test_composability.py::ComposabilityTest::test_pp_ddp_ScheduleClass1, test/distributed/test_composability.py::ComposabilityTest::test_pp_ddp_ScheduleClass2, test/distributed/test_composability.py::ComposabilityTest::test_pp_fsdp_dp_type_FSDP_MP_ScheduleClass0, test/distributed/test_composability.py::ComposabilityTest::test_pp_fsdp_dp_type_FSDP_MP_ScheduleClass1, test/distributed/test_composability.py::ComposabilityTest::test_pp_fsdp_dp_type_FSDP_MP_ScheduleClass2, test/distributed/test_composability.py::ComposabilityTest::test_pp_fsdp_dp_type_FSDP_MP_ScheduleClass3, test/distributed/test_composability.py::ComposabilityTest::test_pp_fsdp_dp_type_FSDP_ScheduleClass0, test/distributed/test_composability.py::ComposabilityTest::test_pp_fsdp_dp_type_FSDP_ScheduleClass1, test/distributed/test_composability.py::ComposabilityTest::test_pp_fsdp_dp_type_FSDP_ScheduleClass2, test/distributed/test_composability.py::ComposabilityTest::test_pp_fsdp_dp_type_FSDP_ScheduleClass3, test/distributed/test_composability.py::ComposabilityTest::test_pp_fsdp_unshard_reshard_runtime_dp_type_FSDP, test/distributed/test_composability.py::ComposabilityTest::test_pp_fsdp_unshard_reshard_runtime_dp_type_FSDP_MP
2025-12-04T14:17:01.1531811Z
2025-12-04T14:17:01.1532034Z Finished distributed/test_composability 1/1 ... [2025-12-04 14:17:01.151844][2242931.641819682], took 0.22min
2025-12-04T14:17:01.1544307Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T14:17:01.1560350Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T14:17:01.1562731Z Running distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 ... [2025-12-04 14:17:01.156161][2242931.646141271]
2025-12-04T14:17:01.1563033Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T14:17:01.1564742Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_dtensor_state_dict.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:17:01.156356]
2025-12-04T14:25:34.2853750Z
2025-12-04T14:25:34.2855640Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 (test/test-reports/distributed.fsdp.test_fsdp_dtensor_state_dict_1.1_b117e26eea61d004_.log)
2025-12-04T14:25:34.2857075Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-3be82ed78da61f1d.xml
2025-12-04T14:25:34.2858419Z ============================= test session starts ==============================
2025-12-04T14:25:34.2859034Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.2859625Z cachedir: .pytest_cache
2025-12-04T14:25:34.2860357Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.2860798Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.2860934Z configfile: pytest.ini
2025-12-04T14:25:34.2861222Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.2861801Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.2862302Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.2862743Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.2863196Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.2863338Z collected 15 items
2025-12-04T14:25:34.2863474Z stepcurrent: Cannot find last run test, not skipping
2025-12-04T14:25:34.2868416Z Running 15 items in this shard: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda
2025-12-04T14:25:34.2872452Z
2025-12-04T14:25:34.2872867Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 14:17:02.962000 358180 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 358249
2025-12-04T14:25:34.2873472Z I1204 14:17:02.962000 358180 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 358250
2025-12-04T14:25:34.2873820Z I1204 14:17:02.963000 358180 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 358251
2025-12-04T14:25:34.2874165Z I1204 14:17:02.964000 358180 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 358252
2025-12-04T14:25:34.2875080Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.2875916Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.2876681Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
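The FutureWarning repeated above names its own replacements: get_state_dict() and set_state_dict() from torch.distributed.checkpoint.state_dict, per the linked API doc. A minimal sketch of that migration, using a plain single-process module as an illustrative stand-in for the FSDP-wrapped model in the test:

```python
import torch
from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

# Illustrative stand-in; in the test above this would be an FSDP-wrapped
# module, but the same API is advertised to cover FSDP1, FSDP2, and DDP.
model = torch.nn.Linear(4, 4)
optim = torch.optim.SGD(model.parameters(), lr=0.1)

# Replacement for FSDP.state_dict_type()/FSDP.set_state_dict_type():
# fetch model and optimizer state dicts through one parallelism-agnostic API...
model_sd, optim_sd = get_state_dict(model, optim)

# ...and load them back the same way (e.g. after torch.distributed.checkpoint
# save/load, which the warning's tutorial link walks through).
set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)
```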
2025-12-04T14:25:34.2877488Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.2878237Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.2878993Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.2879749Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.2880555Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.2881913Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.2883376Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.2884914Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.2886473Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.2887937Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.2889366Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.2890835Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
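The UserWarning repeated above (once per rank) names its own off switch. A one-call sketch using exactly the function the warning text cites; as the warning says, this is only appropriate when the stream mismatch is intentional:

```python
import torch

# Silence the AccumulateGrad stream-mismatch warning emitted above.
# Per the warning text, only do this when the mismatch is intentional,
# e.g. an autograd graph deliberately kept alive across iterations.
torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)
```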
2025-12-04T14:25:34.2892262Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.2892559Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.2892887Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.2893385Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.2893861Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.2894332Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.2894763Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.2895237Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.2895686Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.2896136Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.2896631Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.2897084Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.2897524Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.2898025Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.2898476Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.2899227Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 803209216 and is now 2843738112.
2025-12-04T14:25:34.2899931Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.2900335Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.2901046Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.2901655Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.2902019Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.2902427Z E1204 14:17:10.803000 358252 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.2902758Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.2903081Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.2903570Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.2904041Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.2904508Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.2904952Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.2905381Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.2905834Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.2906340Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.2906797Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.2907289Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.2907726Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.2908173Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.2908653Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.2909442Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T14:25:34.2910145Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.2910588Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.2911354Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.2911956Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.2912303Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.2912705Z E1204 14:17:10.869000 358249 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.2913034Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.2913357Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.2913831Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.2914297Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.2914765Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.2915199Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.2915622Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.2916114Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.2916566Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.2942763Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.2943422Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.2943877Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.2944325Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.2944777Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.2945529Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.2946233Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.2946579Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.2947275Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.2947891Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.2948249Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.2948652Z E1204 14:17:10.892000 358250 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.2948987Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.2949314Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.2949796Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.2950322Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.2950793Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.2951282Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.2951706Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.2952150Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.2952625Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.2953078Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.2953525Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.2953961Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.2954403Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.2954852Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.2955582Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.2956283Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.2956620Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.2957304Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.2957914Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.2958269Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.2958672Z E1204 14:17:10.893000 358251 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.2958913Z FAILED [9.2240s] [ 6%]
2025-12-04T14:25:34.2958981Z
2025-12-04T14:25:34.2959047Z =================================== FAILURES ===================================
2025-12-04T14:25:34.2959334Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _
2025-12-04T14:25:34.2959612Z Traceback (most recent call last):
2025-12-04T14:25:34.2959861Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.2960109Z     self._join_processes(fn)
2025-12-04T14:25:34.2960399Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.2960716Z     self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.2960987Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.2961249Z     raise RuntimeError(error)
2025-12-04T14:25:34.2961407Z RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T14:25:34.2961573Z Traceback (most recent call last):
2025-12-04T14:25:34.2961813Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.2962089Z     getattr(self, test_name)()
2025-12-04T14:25:34.2962325Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.2962562Z     fn()
2025-12-04T14:25:34.2962767Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.2963008Z     method(*args, **kwargs)
2025-12-04T14:25:34.2963234Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.2963468Z     method(*args, **kwargs)
2025-12-04T14:25:34.2963693Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.2963927Z     with policy():
2025-12-04T14:25:34.2964147Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.2964389Z     raise RuntimeError(msg)
2025-12-04T14:25:34.2964899Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 803209216 and is now 2843738112.
2025-12-04T14:25:34.2965374Z
2025-12-04T14:25:34.2965458Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.2965914Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.2966291Z
2025-12-04T14:25:34.2966388Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.2966514Z
2025-12-04T14:25:34.2966519Z
2025-12-04T14:25:34.2966600Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.2966814Z Process 3 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.2967220Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-3be82ed78da61f1d.xml -
2025-12-04T14:25:34.2967598Z =========================== short test summary info ============================
2025-12-04T14:25:34.2968062Z FAILED [9.2240s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T14:25:34.2968505Z Traceback (most recent call last):
2025-12-04T14:25:34.2968763Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.2969014Z     getattr(self, test_name)()
2025-12-04T14:25:34.2969248Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.2969504Z     fn()
2025-12-04T14:25:34.2969709Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.2969944Z     method(*args, **kwargs)
2025-12-04T14:25:34.2970210Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.2970445Z     method(*args, **kwargs)
2025-12-04T14:25:34.2970668Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.2970898Z     with policy():
2025-12-04T14:25:34.2971153Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.2971389Z     raise RuntimeError(msg)
2025-12-04T14:25:34.2971891Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 803209216 and is now 2843738112.
2025-12-04T14:25:34.2972355Z
2025-12-04T14:25:34.2972431Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.2972884Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.2973261Z
2025-12-04T14:25:34.2973358Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.2973552Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.2973717Z ============================== 1 failed in 9.39s ===============================
2025-12-04T14:25:34.2973856Z Got exit code 1
2025-12-04T14:25:34.2973964Z Retrying single test...
2025-12-04T14:25:34.2974259Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-d39cd143d71f0d28.xml
2025-12-04T14:25:34.2974576Z ============================= test session starts ==============================
2025-12-04T14:25:34.2974787Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.2974973Z cachedir: .pytest_cache
2025-12-04T14:25:34.2975199Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.2975444Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.2975568Z configfile: pytest.ini
2025-12-04T14:25:34.2975796Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.2976354Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.2976799Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.2977226Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.2977665Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.2977815Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.2978238Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
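Note on the PytestCollectionWarning lines above: the two helper nn.Modules are named Test* but are not test cases, so pytest skips them only because they define __init__. A minimal sketch of the conventional opt-out, assuming stock pytest collection; the layer inside is an illustrative placeholder, not the test file's real body:

    import torch

    class TestDummyModel(torch.nn.Module):
        # Explicitly opt out of pytest collection so no
        # PytestCollectionWarning is emitted for this helper class.
        __test__ = False

        def __init__(self) -> None:
            super().__init__()
            self.net = torch.nn.Linear(8, 8)  # illustrative layer only
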
2025-12-04T14:25:34.2978683Z Running 1 items in this shard
2025-12-04T14:25:34.2978758Z
2025-12-04T14:25:34.2979176Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 14:17:14.916000 358582 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 358651
2025-12-04T14:25:34.2979778Z I1204 14:17:14.917000 358582 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 358652
2025-12-04T14:25:34.2980152Z I1204 14:17:14.917000 358582 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 358653
2025-12-04T14:25:34.2980556Z I1204 14:17:14.918000 358582 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 358654
2025-12-04T14:25:34.2981432Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.2982189Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.2982946Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.2983702Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.2984447Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.2985196Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.2985935Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.2986680Z FSDP.set_state_dict_type(
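Note on the FutureWarning above (emitted once per rank): it points at the replacement checkpoint APIs in torch.distributed.checkpoint.state_dict. A minimal migration sketch, assuming an already-initialized process group and an FSDP-wrapped model; model and optim here are illustrative placeholders, not the test's objects:

    import torch
    from torch.distributed.checkpoint.state_dict import (
        StateDictOptions,
        get_state_dict,
        set_state_dict,
    )

    def roundtrip_state_dict(model: torch.nn.Module, optim: torch.optim.Optimizer) -> None:
        # Replaces FSDP.set_state_dict_type(...) + model.state_dict():
        # by default this returns sharded (DTensor-backed) state dicts.
        model_sd, optim_sd = get_state_dict(
            model, optim, options=StateDictOptions(cpu_offload=False)
        )
        # Replaces FSDP.set_state_dict_type(...) + model.load_state_dict():
        set_state_dict(
            model,
            optim,
            model_state_dict=model_sd,
            optim_state_dict=optim_sd,
        )
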
2025-12-04T14:25:34.2988125Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.2989587Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.2991108Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.2992527Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.2993956Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.2995363Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.2996781Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.2998193Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.2998491Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.2998812Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.2999311Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.2999783Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3000287Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3000746Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3001167Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3001616Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3002063Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3002514Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3002966Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3003402Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3003835Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3004287Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3005031Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3005730Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3006070Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3006765Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3007376Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3007736Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3008142Z E1204 14:17:22.747000 358654 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.3008476Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3008838Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3009305Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3009772Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3010297Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3010733Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3011163Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3011613Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3012065Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3012509Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3012954Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3013386Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3013819Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3014264Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3014990Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3015676Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3016008Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3016687Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3017285Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3017629Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3018058Z E1204 14:17:22.751000 358652 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.3018379Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3018696Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3019192Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3019657Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3020118Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3020596Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3021018Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3021465Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3021914Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3022362Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3022813Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3023247Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3023682Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3024135Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3024867Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3025563Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3025895Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3026580Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3027181Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3027561Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3027959Z E1204 14:17:22.763000 358653 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.3028287Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3028611Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3029117Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3029587Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3030054Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3030519Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3030946Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3031391Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3031840Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3032287Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3032737Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3033171Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3033609Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3034064Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3034796Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T14:25:34.3035487Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3035823Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3036509Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3037150Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3037498Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3037895Z E1204 14:17:22.797000 358651 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.3038131Z FAILED [9.1193s] [100%]
2025-12-04T14:25:34.3038197Z
2025-12-04T14:25:34.3038286Z =================================== FAILURES ===================================
2025-12-04T14:25:34.3038572Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _
2025-12-04T14:25:34.3038848Z Traceback (most recent call last):
2025-12-04T14:25:34.3039092Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.3039341Z     self._join_processes(fn)
2025-12-04T14:25:34.3039588Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.3039855Z     self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.3040128Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.3040431Z     raise RuntimeError(error)
2025-12-04T14:25:34.3040588Z RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T14:25:34.3040754Z Traceback (most recent call last):
2025-12-04T14:25:34.3040992Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3041241Z     getattr(self, test_name)()
2025-12-04T14:25:34.3041473Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3041707Z     fn()
2025-12-04T14:25:34.3041911Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3042145Z     method(*args, **kwargs)
2025-12-04T14:25:34.3042367Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3042599Z     method(*args, **kwargs)
2025-12-04T14:25:34.3042819Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3043049Z     with policy():
2025-12-04T14:25:34.3043264Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3043498Z     raise RuntimeError(msg)
2025-12-04T14:25:34.3044004Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3044473Z
2025-12-04T14:25:34.3044549Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3045006Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3045379Z
2025-12-04T14:25:34.3045471Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3045966Z
2025-12-04T14:25:34.3045968Z
2025-12-04T14:25:34.3046050Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.3046255Z Process 1 terminated with exit code 10, terminating remaining processes.
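The "Process 1 terminated with exit code 10, terminating remaining processes." line reflects the harness flow visible in the tracebacks: spawn one worker per rank, join them, and turn any nonzero exit code into the RuntimeError shown above. A minimal sketch of that spawn/join/exit-code pattern, assuming torch.multiprocessing; the worker body and the reuse of exit code 10 are illustrative, not the actual common_distributed implementation:

    import torch.multiprocessing as mp

    MEM_LEAK_EXIT_CODE = 10  # value observed in this log, reused for illustration

    def _worker(rank: int) -> None:
        # A real worker runs the test body; a failure becomes a nonzero exit code.
        raise SystemExit(MEM_LEAK_EXIT_CODE if rank == 1 else 0)

    def run(world_size: int = 4) -> None:
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=_worker, args=(rank,)) for rank in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

    if __name__ == "__main__":
        run()
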
2025-12-04T14:25:34.3046651Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-d39cd143d71f0d28.xml -
2025-12-04T14:25:34.3047020Z =========================== short test summary info ============================
2025-12-04T14:25:34.3047505Z FAILED [9.1193s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T14:25:34.3047940Z Traceback (most recent call last):
2025-12-04T14:25:34.3048186Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3048434Z     getattr(self, test_name)()
2025-12-04T14:25:34.3048668Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3048905Z     fn()
2025-12-04T14:25:34.3049110Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3049340Z     method(*args, **kwargs)
2025-12-04T14:25:34.3049558Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3049792Z     method(*args, **kwargs)
2025-12-04T14:25:34.3050010Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3050282Z     with policy():
2025-12-04T14:25:34.3050498Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3050734Z     raise RuntimeError(msg)
2025-12-04T14:25:34.3051238Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3051706Z
2025-12-04T14:25:34.3051783Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3052236Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3052613Z
2025-12-04T14:25:34.3052707Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3052901Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.3053074Z ======================= 1 failed, 14 deselected in 9.28s =======================
2025-12-04T14:25:34.3053219Z Got exit code 1
2025-12-04T14:25:34.3053323Z Retrying single test...
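Both attempts so far fail identically: the mem-leak checker snapshots CUDA caching-allocator usage before the test body and compares it afterwards, and each rank reports usage going from 0 to 2560 bytes while driver-side allocation also grows. A minimal sketch of that style of check, assuming a visible CUDA/ROCm device; this illustrates the before/after comparison the RuntimeError reports, not the framework's actual leak-check context manager, and assert_no_cuda_leak is a hypothetical name:

    import contextlib

    import torch

    @contextlib.contextmanager
    def assert_no_cuda_leak(device: int = 0):
        # Snapshot caching-allocator and driver-side usage before the block.
        torch.cuda.synchronize(device)
        allocated_before = torch.cuda.memory_allocated(device)
        free, total = torch.cuda.mem_get_info(device)
        driver_before = total - free
        yield
        # Re-check after the block; growth in allocator bytes indicates a leak.
        torch.cuda.synchronize(device)
        allocated_after = torch.cuda.memory_allocated(device)
        free, total = torch.cuda.mem_get_info(device)
        driver_after = total - free
        if allocated_after > allocated_before:
            raise RuntimeError(
                f"Caching allocator allocated memory was {allocated_before} "
                f"and is now reported as {allocated_after} on device {device}. "
                f"CUDA driver allocated memory was {driver_before} "
                f"and is now {driver_after}."
            )
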
2025-12-04T14:25:34.3053620Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-43f2eb96a39bd225.xml
2025-12-04T14:25:34.3053948Z ============================= test session starts ==============================
2025-12-04T14:25:34.3054168Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3054362Z cachedir: .pytest_cache
2025-12-04T14:25:34.3054591Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3054875Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3055000Z configfile: pytest.ini
2025-12-04T14:25:34.3055238Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3055812Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3056262Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3056738Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3057184Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3057340Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.3057767Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3058182Z Running 1 items in this shard
2025-12-04T14:25:34.3058257Z
2025-12-04T14:25:34.3058676Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 14:17:26.510000 358984 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 359053
2025-12-04T14:25:34.3059269Z I1204 14:17:26.510000 358984 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 359054
2025-12-04T14:25:34.3059620Z I1204 14:17:26.511000 358984 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 359055
2025-12-04T14:25:34.3059966Z I1204 14:17:26.511000 358984 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 359056
2025-12-04T14:25:34.3060907Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3061670Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3062415Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3063164Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3063904Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3064648Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3065394Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3066182Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3067552Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3068980Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3070447Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3071874Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3073296Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3074710Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3076167Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3077612Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
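The UserWarning above names its own opt-out. If the stream mismatch is intentional, the toggle referenced in the warning text silences it; call it once before running backward (a one-line sketch using exactly the function the warning names):

    import torch

    # Referenced verbatim in the warning above; suppresses the AccumulateGrad
    # stream-mismatch warning when the mismatch is intentional.
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)

Otherwise, the fix the warning itself suggests is to drop lingering references to the autograd graph or perform DDP initialization on the same stream as subsequent forwards.
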
2025-12-04T14:25:34.3077908Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3078235Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3078706Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3079173Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3079638Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3080068Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3080532Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3080984Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3081439Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3081887Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3082337Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3082779Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3083222Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3083675Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3084420Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T14:25:34.3085149Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3085491Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3086213Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3086819Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3087173Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3087577Z E1204 14:17:33.900000 359053 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.3087906Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3088228Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3088704Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3089173Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3089643Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3090078Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3090545Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3090998Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3091449Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3091900Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3092350Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3092794Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3093229Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3093670Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3094401Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3095128Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3095469Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3096189Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3096795Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3097142Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3097539Z E1204 14:17:33.933000 359055 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.3097865Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3098185Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3098655Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3099118Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3099578Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3100010Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3100497Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3100950Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3101397Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3101845Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3102290Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3102727Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3103164Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3103648Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3104376Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 803209216 and is now 2843738112.
2025-12-04T14:25:34.3105067Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3105432Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3106121Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3106721Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3107068Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3107465Z E1204 14:17:33.938000 359056 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.3107790Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3108110Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3108582Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3109048Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3109510Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3109944Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3110414Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3110868Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3111318Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3111766Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3112217Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3112653Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3113090Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3113569Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3114326Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3115020Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3115355Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3116053Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3116664Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3117017Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3117415Z E1204 14:17:33.943000 359054 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.3117651Z FAILED [8.6196s] [100%] 2025-12-04T14:25:34.3117719Z 2025-12-04T14:25:34.3117778Z =================================== FAILURES =================================== 2025-12-04T14:25:34.3118064Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T14:25:34.3118334Z Traceback (most recent call last): 2025-12-04T14:25:34.3118583Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.3118829Z self._join_processes(fn) 2025-12-04T14:25:34.3119077Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.3119342Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.3119611Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.3119874Z raise RuntimeError(error) 2025-12-04T14:25:34.3120026Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.3120227Z Traceback (most recent call last): 2025-12-04T14:25:34.3120471Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3120715Z getattr(self, test_name)() 2025-12-04T14:25:34.3120948Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3121181Z fn() 2025-12-04T14:25:34.3121385Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3121617Z method(*args, **kwargs) 2025-12-04T14:25:34.3121838Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3122068Z method(*args, **kwargs) 2025-12-04T14:25:34.3122330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3122559Z with policy(): 2025-12-04T14:25:34.3122771Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3123002Z raise RuntimeError(msg) 2025-12-04T14:25:34.3123541Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T14:25:34.3124012Z 2025-12-04T14:25:34.3124088Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3124539Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3124923Z 2025-12-04T14:25:34.3125014Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3125139Z 2025-12-04T14:25:34.3125141Z 2025-12-04T14:25:34.3125221Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.3125428Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.3125831Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-43f2eb96a39bd225.xml - 2025-12-04T14:25:34.3126198Z =========================== short test summary info ============================ 2025-12-04T14:25:34.3126650Z FAILED [8.6196s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.3127089Z Traceback (most recent call last): 2025-12-04T14:25:34.3127334Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3127577Z getattr(self, test_name)() 2025-12-04T14:25:34.3127812Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3128046Z fn() 2025-12-04T14:25:34.3128253Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3128485Z method(*args, **kwargs) 2025-12-04T14:25:34.3128705Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3128940Z method(*args, **kwargs) 2025-12-04T14:25:34.3129158Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3129384Z with policy(): 2025-12-04T14:25:34.3129595Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3129826Z raise RuntimeError(msg) 2025-12-04T14:25:34.3130372Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T14:25:34.3130840Z 2025-12-04T14:25:34.3130919Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3131411Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3131788Z 2025-12-04T14:25:34.3131879Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3132068Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T14:25:34.3132236Z ======================= 1 failed, 14 deselected in 8.79s ======================= 2025-12-04T14:25:34.3132377Z Got exit code 1 2025-12-04T14:25:34.3132765Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3133219Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T14:25:34.3133612Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-d689752dd366bfb0.xml 2025-12-04T14:25:34.3133930Z ============================= test session starts ============================== 2025-12-04T14:25:34.3134143Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.3134334Z cachedir: .pytest_cache 2025-12-04T14:25:34.3134558Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.3134800Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.3134923Z configfile: pytest.ini 2025-12-04T14:25:34.3135151Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.3135713Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.3136165Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.3136601Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.3137044Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.3137191Z collected 15 items / 1 deselected / 14 selected 2025-12-04T14:25:34.3137338Z stepcurrent: skipping 1 already run items. 
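Note on the failure mode above: the mem-leak check wraps each test body and compares per-device memory before and after, and the quoted repro line re-runs just the failing test with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 (PYTORCH_PRINT_REPRO_ON_FAILURE=0 silences the repro hint). Below is a minimal sketch of the same before/after comparison built only from public torch.cuda calls; it is an illustration, not PyTorch's internal CudaMemoryLeakCheck policy, and the helper name check_leak is made up. On ROCm builds the torch.cuda namespace maps to HIP, so the same calls apply.

    import torch

    def check_leak(test_fn, device=0):
        # Hypothetical helper: compare caching-allocator bytes and
        # driver-level usage before vs. after running test_fn.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)
        free_before, total = torch.cuda.mem_get_info(device)
        test_fn()
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before:
            # Mirrors the two figures the log reports per device.
            raise RuntimeError(
                f"possible leak: caching allocator {alloc_before} -> {alloc_after} bytes; "
                f"driver allocated {total - free_before} -> {total - free_after} bytes"
            )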
2025-12-04T14:25:34.3133612Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-d689752dd366bfb0.xml
2025-12-04T14:25:34.3133930Z ============================= test session starts ==============================
2025-12-04T14:25:34.3134143Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3134334Z cachedir: .pytest_cache
2025-12-04T14:25:34.3134558Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3134800Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3134923Z configfile: pytest.ini
2025-12-04T14:25:34.3135151Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3135713Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3136165Z   class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3136601Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3137044Z   class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3137191Z collected 15 items / 1 deselected / 14 selected
2025-12-04T14:25:34.3137338Z stepcurrent: skipping 1 already run items.
2025-12-04T14:25:34.3137468Z Running 14 items in this shard
2025-12-04T14:25:34.3137957Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 14:17:37.766000 359386 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 359455
2025-12-04T14:25:34.3138560Z I1204 14:17:37.767000 359386 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 359456
2025-12-04T14:25:34.3138908Z I1204 14:17:37.767000 359386 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 359457
2025-12-04T14:25:34.3139251Z I1204 14:17:37.768000 359386 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 359458
2025-12-04T14:25:34.3140132Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3140969Z   FSDP.set_state_dict_type(
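The FutureWarning above names its replacement API. A rough sketch of the suggested migration follows, assuming `model` is the FSDP-wrapped module and `optim` its optimizer (placeholder names, not objects from this test), and assuming the keyword-only signature documented for torch.distributed.checkpoint.state_dict; these calls are collective and must run on every rank.

    from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

    def checkpoint_roundtrip(model, optim):
        # Replaces the deprecated FSDP.state_dict_type() /
        # FSDP.set_state_dict_type() context: fetch the state dicts...
        model_sd, optim_sd = get_state_dict(model, optim)
        # ...and load them back through the same API family.
        set_state_dict(model, optim,
                       model_state_dict=model_sd,
                       optim_state_dict=optim_sd)
        return model_sd, optim_sd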
2025-12-04T14:25:34.3146842Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3148261Z   return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
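The UserWarning above also names its own switch; if the stream mismatch is intentional, it can be silenced once per process before the backward pass, using the exact toggle quoted in the warning text:

    import torch

    # Silences only this AccumulateGrad stream-mismatch warning,
    # not other autograd diagnostics.
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)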
2025-12-04T14:25:34.3157232Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3157565Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3158046Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3158515Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
2025-12-04T14:25:34.3158980Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3159412Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
2025-12-04T14:25:34.3159839Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3160329Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.3160783Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3161268Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.3161720Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3162187Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
2025-12-04T14:25:34.3162632Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3163085Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
2025-12-04T14:25:34.3163822Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T14:25:34.3164859Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3165548Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3166504Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3166904Z E1204 14:17:45.154000 359455 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.3167232Z E1204 14:17:45.155000 359456 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3173795Z E1204 14:17:45.155000 359456 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3176848Z E1204 14:17:45.155000 359456 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.3177171Z E1204 14:17:45.156000 359457 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3183837Z E1204 14:17:45.156000 359457 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3186898Z E1204 14:17:45.156000 359457 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.3187221Z E1204 14:17:45.165000 359458 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3194904Z E1204 14:17:45.165000 359458 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 803209216 and is now 2843738112.
2025-12-04T14:25:34.3198181Z E1204 14:17:45.165000 359458 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.3198470Z FAILED [8.7208s] [  7%]
2025-12-04T14:25:34.3198625Z =================================== FAILURES ===================================
2025-12-04T14:25:34.3198951Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _
2025-12-04T14:25:34.3199259Z Traceback (most recent call last):
2025-12-04T14:25:34.3199532Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.3199820Z     self._join_processes(fn)
2025-12-04T14:25:34.3200131Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.3200467Z     self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.3200785Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.3201075Z     raise RuntimeError(error)
2025-12-04T14:25:34.3201268Z RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T14:25:34.3201465Z Traceback (most recent call last):
2025-12-04T14:25:34.3205206Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3207023Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.3207272Z Process 2 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.3207697Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-d689752dd366bfb0.xml -
2025-12-04T14:25:34.3208111Z =========================== short test summary info ============================
2025-12-04T14:25:34.3208613Z FAILED [8.7208s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T14:25:34.3215735Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.3215939Z ======================= 1 failed, 1 deselected in 8.88s ========================
2025-12-04T14:25:34.3216105Z Got exit code 1
2025-12-04T14:25:34.3216254Z Retrying single test...
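For scale, the driver-level figures in these leak reports dwarf the caching-allocator residue. A quick conversion of the numbers quoted above (values copied directly from the log):

    # Driver-level growth reported for device 2 in the failure above.
    before, after = 1268776960, 2843738112
    print(f"driver growth: {(after - before) / 2**30:.2f} GiB")  # ~1.47 GiB
    # Caching-allocator residue: 7680 bytes (was 0) for this test,
    # 2560 bytes for the earlier _False_cuda variant.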
2025-12-04T14:25:34.3216574Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-c2824199c1feb447.xml
2025-12-04T14:25:34.3216930Z ============================= test session starts ==============================
2025-12-04T14:25:34.3217181Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3217401Z cachedir: .pytest_cache
2025-12-04T14:25:34.3217676Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3217946Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3218089Z configfile: pytest.ini
2025-12-04T14:25:34.3218364Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3218962Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3219443Z   class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3219914Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3220423Z   class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3220619Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.3221069Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3221541Z Running 1 items in this shard
2025-12-04T14:25:34.3222081Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 14:17:49.000000 359788 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 359857
2025-12-04T14:25:34.3222713Z I1204 14:17:49.001000 359788 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 359858
2025-12-04T14:25:34.3223133Z I1204 14:17:49.002000 359788 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 359859
2025-12-04T14:25:34.3223511Z I1204 14:17:49.002000 359788 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 359860
2025-12-04T14:25:34.3224423Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3225219Z   FSDP.set_state_dict_type(
2025-12-04T14:25:34.3231310Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3232828Z   return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3241983Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3242335Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3242881Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3243380Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
2025-12-04T14:25:34.3243948Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3244423Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
2025-12-04T14:25:34.3244883Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3245359Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.3245859Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3246349Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.3246839Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3247306Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
2025-12-04T14:25:34.3247773Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3248273Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
2025-12-04T14:25:34.3249035Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1254096896 and is now 2843738112.
2025-12-04T14:25:34.3250141Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3250903Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3251934Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3252356Z E1204 14:17:56.795000 359860 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.3252774Z E1204 14:17:56.866000 359858 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3277380Z E1204 14:17:56.866000 359858 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3280522Z E1204 14:17:56.866000 359858 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.3280846Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3281168Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3281663Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3282124Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
2025-12-04T14:25:34.3282583Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3283013Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
2025-12-04T14:25:34.3283434Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3283879Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.3284325Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3284768Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3285211Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3285638Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3286073Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3286514Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3287238Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T14:25:34.3287919Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3288249Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3288931Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3289558Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3289899Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3290322Z E1204 14:17:56.876000 359857 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.3290644Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3290991Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3291454Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3291917Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3292372Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3292796Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3293216Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3293656Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3294095Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3294536Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3294980Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3295410Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3295846Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3296291Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3297016Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T14:25:34.3297701Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3298035Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3301124Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3301742Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3302086Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3302477Z E1204 14:17:56.892000 359859 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.3302740Z FAILED [9.0256s] [100%] 2025-12-04T14:25:34.3302808Z 2025-12-04T14:25:34.3302866Z =================================== FAILURES =================================== 2025-12-04T14:25:34.3303167Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T14:25:34.3303430Z Traceback (most recent call last): 2025-12-04T14:25:34.3303673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.3303916Z self._join_processes(fn) 2025-12-04T14:25:34.3304158Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.3304416Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.3304684Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.3304946Z raise RuntimeError(error) 2025-12-04T14:25:34.3305097Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.3305255Z Traceback (most recent call last): 2025-12-04T14:25:34.3305493Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3305735Z getattr(self, test_name)() 2025-12-04T14:25:34.3305964Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3306191Z fn() 2025-12-04T14:25:34.3306389Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3306618Z method(*args, **kwargs) 2025-12-04T14:25:34.3306836Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3307066Z method(*args, **kwargs) 2025-12-04T14:25:34.3307280Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3307501Z with policy(): 2025-12-04T14:25:34.3307709Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3307938Z raise RuntimeError(msg) 2025-12-04T14:25:34.3308432Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1254096896 and is now 2843738112. 2025-12-04T14:25:34.3308892Z 2025-12-04T14:25:34.3308967Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3309411Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3309845Z 2025-12-04T14:25:34.3309935Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3310075Z 2025-12-04T14:25:34.3310077Z 2025-12-04T14:25:34.3310155Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.3310396Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.3310788Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-c2824199c1feb447.xml - 2025-12-04T14:25:34.3311152Z =========================== short test summary info ============================ 2025-12-04T14:25:34.3311631Z FAILED [9.0256s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.3312055Z Traceback (most recent call last): 2025-12-04T14:25:34.3312298Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3312536Z getattr(self, test_name)() 2025-12-04T14:25:34.3312764Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3312993Z fn() 2025-12-04T14:25:34.3313188Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3313413Z method(*args, **kwargs) 2025-12-04T14:25:34.3313631Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3313856Z method(*args, **kwargs) 2025-12-04T14:25:34.3314071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3314294Z with policy(): 2025-12-04T14:25:34.3314499Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3314724Z raise RuntimeError(msg) 2025-12-04T14:25:34.3315221Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1254096896 and is now 2843738112. 2025-12-04T14:25:34.3315683Z 2025-12-04T14:25:34.3315757Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3316206Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3316578Z 2025-12-04T14:25:34.3316665Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3316851Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T14:25:34.3317014Z ======================= 1 failed, 14 deselected in 9.19s ======================= 2025-12-04T14:25:34.3317151Z Got exit code 1 2025-12-04T14:25:34.3317245Z Retrying single test... 2025-12-04T14:25:34.3317530Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-25d3faee1784bf36.xml 2025-12-04T14:25:34.3317847Z ============================= test session starts ============================== 2025-12-04T14:25:34.3318057Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.3318274Z cachedir: .pytest_cache 2025-12-04T14:25:34.3318493Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.3318745Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.3318860Z configfile: pytest.ini 2025-12-04T14:25:34.3319086Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.3319645Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.3320080Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.3320575Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.3321009Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.3321152Z collected 15 items / 14 deselected / 1 selected 2025-12-04T14:25:34.3321568Z stepcurrent: skipping 1 already run items. 
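[editor's note] The PytestCollectionWarning above is benign: pytest attempts to collect any class whose name matches Test*, and refuses ones that define __init__, which these nn.Module helpers do. If the noise is unwanted, pytest's standard escape hatch is to mark the helper as a non-test. A sketch, with an illustrative class body rather than the real one from the test file:

import torch

class TestDummyModel(torch.nn.Module):
    __test__ = False  # tell pytest not to collect this despite the Test* name

    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(8, 8)

    def forward(self, x):
        return self.net(x)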
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3321968Z Running 1 items in this shard 2025-12-04T14:25:34.3322039Z 2025-12-04T14:25:34.3322455Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 14:18:00.966000 360190 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 360259 2025-12-04T14:25:34.3323047Z I1204 14:18:00.967000 360190 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 360260 2025-12-04T14:25:34.3323391Z I1204 14:18:00.968000 360190 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 360261 2025-12-04T14:25:34.3323730Z I1204 14:18:00.968000 360190 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 360262 2025-12-04T14:25:34.3324603Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3325345Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3326079Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3326820Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3327554Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3328303Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3329043Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T14:25:34.3329776Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3331187Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T14:25:34.3332596Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T14:25:34.3334001Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T14:25:34.3335393Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T14:25:34.3336799Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T14:25:34.3338196Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T14:25:34.3339651Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T14:25:34.3341109Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T14:25:34.3341396Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3341715Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3342183Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3342644Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3343103Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3343529Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3343947Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3344391Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3344836Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3345274Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3345716Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3346148Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T14:25:34.3346581Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3347025Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3347751Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T14:25:34.3348466Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3348798Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3349509Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3350108Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3350489Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3350889Z E1204 14:18:08.827000 360259 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.3351213Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3351531Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3352000Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3352466Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3352929Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3353360Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3353782Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3354229Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3354674Z E1204 14:18:08.879000 
360262 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3355119Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3355566Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3355998Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3356436Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3356881Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3357620Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1264582656 and is now 2843738112. 2025-12-04T14:25:34.3358322Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3358653Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3359361Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3359957Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3360345Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3360742Z E1204 14:18:08.879000 360262 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.3361069Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3361388Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3361854Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3362315Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3362776Z E1204 14:18:08.912000 360261 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3363206Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3363631Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3364075Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3364519Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3364960Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3365406Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3365838Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3366276Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3366753Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3367475Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1245708288 and is now 2843738112. 
2025-12-04T14:25:34.3368186Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3368518Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3369195Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3369788Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3370134Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3370571Z E1204 14:18:08.912000 360261 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.3370892Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3371212Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3371679Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3372138Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3372598Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3373032Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3373452Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3373898Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3374344Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3374787Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3375229Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3375678Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3376129Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3376704Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3377456Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T14:25:34.3378143Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3378474Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3379150Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3379746Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3380096Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3380523Z E1204 14:18:08.922000 360260 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.3380761Z FAILED [9.0200s] [100%] 2025-12-04T14:25:34.3380827Z 2025-12-04T14:25:34.3380884Z =================================== FAILURES =================================== 2025-12-04T14:25:34.3381159Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T14:25:34.3381423Z Traceback (most recent call last): 2025-12-04T14:25:34.3381663Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.3381909Z self._join_processes(fn) 2025-12-04T14:25:34.3382156Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.3382418Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.3382687Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.3382949Z raise RuntimeError(error) 2025-12-04T14:25:34.3383101Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.3383264Z Traceback (most recent call last): 2025-12-04T14:25:34.3383505Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3383745Z getattr(self, test_name)() 2025-12-04T14:25:34.3383975Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T14:25:34.3384207Z fn() 2025-12-04T14:25:34.3384413Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3384641Z method(*args, **kwargs) 2025-12-04T14:25:34.3384882Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3385127Z method(*args, **kwargs) 2025-12-04T14:25:34.3385346Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3385571Z with policy(): 2025-12-04T14:25:34.3385781Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3386011Z raise RuntimeError(msg) 2025-12-04T14:25:34.3386541Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T14:25:34.3387004Z 2025-12-04T14:25:34.3387080Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3387526Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3387902Z 2025-12-04T14:25:34.3387990Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3388117Z 2025-12-04T14:25:34.3388119Z 2025-12-04T14:25:34.3388196Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.3388401Z Process 0 terminated with exit code 10, terminating remaining processes. 
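[editor's note] The RuntimeError above comes from the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 harness, which snapshots per-device memory before and after the test body and fails if the caching-allocator count grew; it reports both the allocator's view and the driver-level view. The identical 7680-byte residue on every rank suggests one small tensor surviving the test on each device. A rough reconstruction of the check, not the harness's actual code (the function name is illustrative):

import torch

def _memory_snapshot(device: int):
    torch.cuda.synchronize(device)
    allocator_bytes = torch.cuda.memory_allocated(device)  # caching allocator
    free, total = torch.cuda.mem_get_info(device)
    driver_bytes = total - free                            # driver-level view
    return allocator_bytes, driver_bytes

before = _memory_snapshot(0)
# ... test body runs here ...
after = _memory_snapshot(0)
if after[0] > before[0]:
    raise RuntimeError(
        f"leak: caching allocator was {before[0]} and is now {after[0]}; "
        f"driver allocated was {before[1]} and is now {after[1]}"
    )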
2025-12-04T14:25:34.3388801Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-25d3faee1784bf36.xml - 2025-12-04T14:25:34.3389170Z =========================== short test summary info ============================ 2025-12-04T14:25:34.3389616Z FAILED [9.0200s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.3390046Z Traceback (most recent call last): 2025-12-04T14:25:34.3390322Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3390564Z getattr(self, test_name)() 2025-12-04T14:25:34.3390794Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3391028Z fn() 2025-12-04T14:25:34.3391226Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3391272Z method(*args, **kwargs) 2025-12-04T14:25:34.3391423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3391467Z method(*args, **kwargs) 2025-12-04T14:25:34.3391617Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3391653Z with policy(): 2025-12-04T14:25:34.3391807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3391847Z raise RuntimeError(msg) 2025-12-04T14:25:34.3392283Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T14:25:34.3392300Z 2025-12-04T14:25:34.3392375Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3392726Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3392729Z 2025-12-04T14:25:34.3392814Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3392879Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T14:25:34.3392940Z ======================= 1 failed, 14 deselected in 9.18s ======================= 2025-12-04T14:25:34.3392980Z Got exit code 1 2025-12-04T14:25:34.3393295Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3393426Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T14:25:34.3393654Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-d17a1becdac728d4.xml 2025-12-04T14:25:34.3393713Z ============================= test session starts ============================== 2025-12-04T14:25:34.3393829Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.3393871Z cachedir: .pytest_cache 2025-12-04T14:25:34.3394029Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.3394077Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.3394121Z configfile: pytest.ini 2025-12-04T14:25:34.3394283Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.3394643Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.3394694Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.3395040Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.3395096Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.3395155Z collected 15 items / 2 deselected / 13 selected 2025-12-04T14:25:34.3395207Z stepcurrent: skipping 2 already run items. 2025-12-04T14:25:34.3395252Z Running 13 items in this shard 2025-12-04T14:25:34.3395255Z 2025-12-04T14:25:34.3395661Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 14:18:12.812000 360592 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 360661 2025-12-04T14:25:34.3395819Z I1204 14:18:12.812000 360592 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 360662 2025-12-04T14:25:34.3395972Z I1204 14:18:12.813000 360592 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 360663 2025-12-04T14:25:34.3396120Z I1204 14:18:12.813000 360592 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 360664 2025-12-04T14:25:34.3396801Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3396875Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3397561Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3397605Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3398275Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3398321Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3398989Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3399033Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3400334Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T14:25:34.3400461Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T14:25:34.3401716Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. 
This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T14:25:34.3401864Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T14:25:34.3403138Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T14:25:34.3403259Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T14:25:34.3404506Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
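The UserWarning repeated by all four ranks above names its own switch, torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch. A minimal sketch of using it, assuming the mismatch is the known-benign DDP/FSDP stashing case the warning itself describes:

import torch

# The warning text above says the mismatch comes from an AccumulateGrad node
# kept alive across iterations (e.g. by DDP stashing a reference to it). If
# that is intentional, this toggle silences the warning process-wide; call it
# again with True to restore the default behavior.
torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)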
2025-12-04T14:25:34.3404630Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3404763Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3404919Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3405202Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3405356Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3405637Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3405751Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3406023Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3406164Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3406457Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3406596Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3406864Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3407012Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3407283Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3407428Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3407980Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
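The RuntimeError above is the mem_leak_check harness comparing caching-allocator and driver-level memory before and after the test body. A rough sketch of that kind of check, as an illustration only and not PyTorch's actual leak-check implementation:

import torch

def assert_no_cuda_leak(fn, device: int = 0) -> None:
    # Snapshot caching-allocator bytes and driver-level free memory before
    # the test body, then re-check after, mirroring the numbers the error
    # message above reports ("allocated memory was ... and is now ...").
    torch.cuda.synchronize(device)
    before = torch.cuda.memory_allocated(device)
    free_before, _ = torch.cuda.mem_get_info(device)
    fn()
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    if after > before and free_after < free_before:
        raise RuntimeError(
            f"possible leak on device {device}: allocator {before} -> {after}"
        )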
2025-12-04T14:25:34.3408093Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3408281Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3408745Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3408857Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3409060Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3409221Z E1204 14:18:20.151000 360661 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.3409349Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3409502Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3409780Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3409928Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3410239Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3410358Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3410626Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3410792Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3411061Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3411198Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3411494Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3411621Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3411895Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3412033Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3412585Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T14:25:34.3412694Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3412883Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3413340Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3413446Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3413652Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3413807Z E1204 14:18:20.166000 360662 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.3413940Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3414092Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3414371Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3415052Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3415332Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3415468Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3415746Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3415889Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3416156Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3416317Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3416587Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3416716Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3416988Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3417128Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3417677Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1251999744 and is now 2820669440. 2025-12-04T14:25:34.3417785Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3417976Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3418430Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3418536Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3418740Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3418899Z E1204 14:18:20.183000 360663 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.3419032Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3419182Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3419465Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3419611Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3419899Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3420026Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3420564Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3420706Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3421003Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3421148Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3421415Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3421548Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3421815Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3421958Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3422509Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T14:25:34.3422616Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3422806Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3423257Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3423367Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3423569Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3423727Z E1204 14:18:20.218000 360664 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.3423766Z FAILED [8.5192s] [ 7%] 2025-12-04T14:25:34.3423771Z 2025-12-04T14:25:34.3423829Z =================================== FAILURES =================================== 2025-12-04T14:25:34.3424014Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T14:25:34.3424061Z Traceback (most recent call last): 2025-12-04T14:25:34.3424225Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.3424302Z self._join_processes(fn) 2025-12-04T14:25:34.3424478Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.3424531Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.3424711Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.3424755Z raise RuntimeError(error) 2025-12-04T14:25:34.3424839Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T14:25:34.3424884Z Traceback (most recent call last): 2025-12-04T14:25:34.3425070Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3425113Z getattr(self, test_name)() 2025-12-04T14:25:34.3425279Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3425316Z fn() 2025-12-04T14:25:34.3425470Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3425511Z method(*args, **kwargs) 2025-12-04T14:25:34.3425662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3425702Z method(*args, **kwargs) 2025-12-04T14:25:34.3425854Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3425891Z with policy(): 2025-12-04T14:25:34.3426047Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3426088Z raise RuntimeError(msg) 2025-12-04T14:25:34.3426520Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T14:25:34.3426523Z 2025-12-04T14:25:34.3426598Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3426935Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3426937Z 2025-12-04T14:25:34.3427028Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3427030Z 2025-12-04T14:25:34.3427032Z 2025-12-04T14:25:34.3427108Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.3427200Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.3427472Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-d17a1becdac728d4.xml - 2025-12-04T14:25:34.3427536Z =========================== short test summary info ============================ 2025-12-04T14:25:34.3427880Z FAILED [8.5192s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T14:25:34.3427930Z Traceback (most recent call last): 2025-12-04T14:25:34.3428093Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3428152Z getattr(self, test_name)() 2025-12-04T14:25:34.3428311Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3428361Z fn() 2025-12-04T14:25:34.3428510Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3428553Z method(*args, **kwargs) 2025-12-04T14:25:34.3428703Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3428746Z method(*args, **kwargs) 2025-12-04T14:25:34.3428897Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3428960Z with policy(): 2025-12-04T14:25:34.3429115Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3429158Z raise RuntimeError(msg) 2025-12-04T14:25:34.3429587Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3429590Z
2025-12-04T14:25:34.3429664Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3430003Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3430005Z
2025-12-04T14:25:34.3430091Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3430159Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.3430258Z ======================= 1 failed, 2 deselected in 8.68s ========================
2025-12-04T14:25:34.3430299Z Got exit code 1
2025-12-04T14:25:34.3430339Z Retrying single test...
2025-12-04T14:25:34.3430566Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-9a0ce4ea1a88d4bb.xml
2025-12-04T14:25:34.3430623Z ============================= test session starts ==============================
2025-12-04T14:25:34.3430739Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3430780Z cachedir: .pytest_cache
2025-12-04T14:25:34.3430943Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3430988Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3431032Z configfile: pytest.ini
2025-12-04T14:25:34.3431195Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3431559Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3431613Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3431956Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3432018Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3432074Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.3432419Z stepcurrent: skipping 2 already run items.
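On the PytestCollectionWarning above: pytest collects Test*-named classes and refuses any that define __init__, which is why the two nn.Module helpers are warned about rather than collected. A common way to silence the warning, sketched here on a stand-in class rather than the file's real contents, is to mark the class with __test__ = False:

import torch

class TestDummyModel(torch.nn.Module):
    __test__ = False  # tell pytest this nn.Module is not a test class

    def __init__(self) -> None:
        super().__init__()
        self.net = torch.nn.Linear(8, 8)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)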
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3432477Z Running 1 items in this shard 2025-12-04T14:25:34.3432479Z 2025-12-04T14:25:34.3432884Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 14:18:23.969000 360994 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 361063 2025-12-04T14:25:34.3433039Z I1204 14:18:23.970000 360994 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 361064 2025-12-04T14:25:34.3433218Z I1204 14:18:23.970000 360994 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 361065 2025-12-04T14:25:34.3433376Z I1204 14:18:23.971000 360994 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 361066 2025-12-04T14:25:34.3434059Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3434106Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3434784Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3434832Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3435502Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3435546Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3436217Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T14:25:34.3436260Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3437545Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T14:25:34.3437695Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T14:25:34.3438965Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T14:25:34.3439090Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T14:25:34.3440380Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
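The FutureWarning repeated above deprecates FSDP.set_state_dict_type in favor of the torch.distributed.checkpoint state-dict helpers it links. A minimal sketch of that replacement path; the model/optimizer pair is hypothetical, and cpu_offload mirrors the offload_to_cpu=True parametrization of the failing test:

import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def checkpoint_roundtrip(model: torch.nn.Module, optim: torch.optim.Optimizer):
    # Gather the sharded model/optimizer state as (model_sd, optim_sd);
    # these helpers work across FSDP1, FSDP2, and DDP per the warning.
    opts = StateDictOptions(cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optim, options=opts)
    # ... persist model_sd / optim_sd with torch.distributed.checkpoint ...
    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=opts,
    )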
2025-12-04T14:25:34.3440506Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T14:25:34.3441757Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T14:25:34.3441877Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T14:25:34.3442013Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3442193Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3442477Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3442627Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3442932Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3443050Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3443319Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3443463Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3443730Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3443874Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3444143Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3444275Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T14:25:34.3444548Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3444689Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3445243Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 950009856 and is now 2820669440. 2025-12-04T14:25:34.3445353Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3445546Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3445999Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3446110Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3446314Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3446484Z E1204 14:18:31.305000 361066 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.3446628Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3446779Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3447060Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3447234Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3447513Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3447629Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3447900Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3448040Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3448311Z E1204 14:18:31.392000 
361064 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3448454Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3448724Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3448856Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3449127Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3449269Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3449817Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T14:25:34.3449930Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3450122Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3450610Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3450721Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3450938Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3451110Z E1204 14:18:31.392000 361064 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.3451238Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3451392Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3451694Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3451844Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3452124Z E1204 14:18:31.398000 361063 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3452237Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3452505Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3452645Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3452915Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3453055Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3453323Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3453455Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3453725Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3453868Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3454417Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
2025-12-04T14:25:34.3454528Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3454715Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3455173Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3455303Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3455505Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3455663Z E1204 14:18:31.398000 361063 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.3455794Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3455966Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3456241Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3456391Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3456665Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3456780Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3457048Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3457190Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3457460Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3457599Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3457872Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3458000Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3458272Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3458412Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3458957Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3459073Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3459261Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3459723Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3459842Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3460045Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3460260Z E1204 14:18:31.415000 361065 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.3460302Z FAILED [8.6215s] [100%]
2025-12-04T14:25:34.3460305Z
2025-12-04T14:25:34.3460361Z =================================== FAILURES ===================================
2025-12-04T14:25:34.3460547Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _
2025-12-04T14:25:34.3460598Z Traceback (most recent call last):
2025-12-04T14:25:34.3460759Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.3460806Z self._join_processes(fn)
2025-12-04T14:25:34.3460978Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.3461032Z self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.3461210Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.3461256Z raise RuntimeError(error)
2025-12-04T14:25:34.3461335Z RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T14:25:34.3461382Z Traceback (most recent call last):
2025-12-04T14:25:34.3461542Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3461587Z getattr(self, test_name)()
2025-12-04T14:25:34.3461744Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3461781Z fn()
2025-12-04T14:25:34.3461931Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3461973Z method(*args, **kwargs)
2025-12-04T14:25:34.3462123Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3462163Z method(*args, **kwargs)
2025-12-04T14:25:34.3462312Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3462352Z with policy():
2025-12-04T14:25:34.3462504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3462546Z raise RuntimeError(msg)
2025-12-04T14:25:34.3462973Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 950009856 and is now 2820669440.
2025-12-04T14:25:34.3462975Z
2025-12-04T14:25:34.3463052Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3463387Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3463404Z
2025-12-04T14:25:34.3463511Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3463513Z
2025-12-04T14:25:34.3463515Z
2025-12-04T14:25:34.3463592Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.3463679Z Process 3 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.3463955Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-9a0ce4ea1a88d4bb.xml -
2025-12-04T14:25:34.3464014Z =========================== short test summary info ============================
2025-12-04T14:25:34.3464381Z FAILED [8.6215s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T14:25:34.3464431Z Traceback (most recent call last):
2025-12-04T14:25:34.3464598Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3464639Z getattr(self, test_name)()
2025-12-04T14:25:34.3464799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3464833Z fn()
2025-12-04T14:25:34.3464985Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3465026Z method(*args, **kwargs)
2025-12-04T14:25:34.3465178Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3465217Z method(*args, **kwargs)
2025-12-04T14:25:34.3465369Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3465406Z with policy():
2025-12-04T14:25:34.3465558Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3465601Z raise RuntimeError(msg)
2025-12-04T14:25:34.3466027Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 950009856 and is now 2820669440.
2025-12-04T14:25:34.3466029Z
2025-12-04T14:25:34.3466106Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3466440Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3466444Z
2025-12-04T14:25:34.3466531Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3466594Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.3466659Z ======================= 1 failed, 14 deselected in 8.78s =======================
2025-12-04T14:25:34.3466695Z Got exit code 1
2025-12-04T14:25:34.3466737Z Retrying single test...
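[annotation] The two figures in the RuntimeError above map onto public torch.cuda counters: the "caching allocator" number is what torch.cuda.memory_allocated() reports, and the "CUDA driver" number is the device total minus free memory. The repro line also shows the switch that enables the check (PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1). Below is a minimal sketch, not the actual checker in torch/testing/_internal/common_utils.py, of sampling those two quantities around a suspect test body; it assumes only that torch with a CUDA/ROCm device is available.

    import torch

    def gpu_mem_snapshot(device: int) -> tuple[int, int]:
        torch.cuda.synchronize(device)
        # Caching-allocator view: bytes currently held by live tensors.
        allocator = torch.cuda.memory_allocated(device)
        # Driver view: (free, total) from the device; total - free is driver-allocated.
        free, total = torch.cuda.mem_get_info(device)
        return allocator, total - free

    before_alloc, before_driver = gpu_mem_snapshot(0)
    # ... run the suspect test body here ...
    after_alloc, after_driver = gpu_mem_snapshot(0)
    if after_alloc > before_alloc:
        # Same shape of report as the CI message above.
        raise RuntimeError(
            f"possible leak: allocator {before_alloc} -> {after_alloc} bytes, "
            f"driver {before_driver} -> {after_driver} bytes"
        )

A growing allocator count after the test returns, as seen here (0 -> 2560 bytes), is what the harness treats as a confirmed leak.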
2025-12-04T14:25:34.3466961Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ccbcb16b8ef88d41.xml
2025-12-04T14:25:34.3467024Z ============================= test session starts ==============================
2025-12-04T14:25:34.3467137Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3467194Z cachedir: .pytest_cache
2025-12-04T14:25:34.3467350Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3467408Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3467448Z configfile: pytest.ini
2025-12-04T14:25:34.3467612Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3467969Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3468040Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3468385Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3468443Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3468499Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.3468825Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3468870Z Running 1 items in this shard
2025-12-04T14:25:34.3468872Z
2025-12-04T14:25:34.3469275Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 14:18:35.388000 361396 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 361465
2025-12-04T14:25:34.3469435Z I1204 14:18:35.389000 361396 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 361466
2025-12-04T14:25:34.3469586Z I1204 14:18:35.389000 361396 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 361467
2025-12-04T14:25:34.3469738Z I1204 14:18:35.390000 361396 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 361468
2025-12-04T14:25:34.3470458Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3470503Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3471171Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3471213Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3471882Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3471963Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3472628Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3472672Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3473968Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3474096Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3475351Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3475478Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3476733Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3476863Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3478148Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3478271Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3478403Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3478558Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3478839Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3478988Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3479267Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3479382Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3479653Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3479795Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3480064Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3480239Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3480513Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3480640Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3480909Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3481052Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3481599Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1256194048 and is now 2820669440.
2025-12-04T14:25:34.3481744Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3481932Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3482415Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3482525Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3482728Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3482889Z E1204 14:18:42.706000 361468 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.3483018Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3483172Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3483450Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3483598Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3483874Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3483989Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3484257Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3484400Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3484669Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3484808Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3485077Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3485205Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3485476Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3485613Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3486179Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3486286Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3486496Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3486949Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3487057Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3487260Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3487415Z E1204 14:18:42.719000 361466 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.3487547Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3487698Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3487982Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3488131Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3488410Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3488525Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3488792Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3488936Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3489201Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3489341Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3489609Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3489738Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3490014Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3490207Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3490755Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
2025-12-04T14:25:34.3490887Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3491078Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3491531Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3491639Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3491842Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3491998Z E1204 14:18:42.753000 361465 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.3492130Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3492280Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3492560Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3492704Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3492982Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3493095Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3493366Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3493508Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3493774Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3493915Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3494182Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3494325Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3494606Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3494747Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3495314Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3495425Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3495617Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3496068Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3496173Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3496373Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3496529Z E1204 14:18:42.760000 361467 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.3496568Z FAILED [8.7205s] [100%]
2025-12-04T14:25:34.3496570Z
2025-12-04T14:25:34.3496628Z =================================== FAILURES ===================================
2025-12-04T14:25:34.3496808Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _
2025-12-04T14:25:34.3496857Z Traceback (most recent call last):
2025-12-04T14:25:34.3497018Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.3497064Z self._join_processes(fn)
2025-12-04T14:25:34.3497237Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.3497290Z self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.3497472Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.3497516Z raise RuntimeError(error)
2025-12-04T14:25:34.3497596Z RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T14:25:34.3497640Z Traceback (most recent call last):
2025-12-04T14:25:34.3497802Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3497843Z getattr(self, test_name)()
2025-12-04T14:25:34.3498003Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3498037Z fn()
2025-12-04T14:25:34.3498191Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3498232Z method(*args, **kwargs)
2025-12-04T14:25:34.3498392Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3498442Z method(*args, **kwargs)
2025-12-04T14:25:34.3498592Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3498629Z with policy():
2025-12-04T14:25:34.3498781Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3498822Z raise RuntimeError(msg)
2025-12-04T14:25:34.3499275Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
2025-12-04T14:25:34.3499279Z
2025-12-04T14:25:34.3499355Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3499690Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3499692Z
2025-12-04T14:25:34.3499782Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3499784Z
2025-12-04T14:25:34.3499786Z
2025-12-04T14:25:34.3499860Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.3499949Z Process 0 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.3500359Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ccbcb16b8ef88d41.xml -
2025-12-04T14:25:34.3500423Z =========================== short test summary info ============================
2025-12-04T14:25:34.3500769Z FAILED [8.7205s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T14:25:34.3500817Z Traceback (most recent call last):
2025-12-04T14:25:34.3500981Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3501025Z getattr(self, test_name)()
2025-12-04T14:25:34.3501185Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3501225Z fn()
2025-12-04T14:25:34.3501374Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3501417Z method(*args, **kwargs)
2025-12-04T14:25:34.3501566Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3501610Z method(*args, **kwargs)
2025-12-04T14:25:34.3501758Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3501797Z with policy():
2025-12-04T14:25:34.3501947Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3501991Z raise RuntimeError(msg)
2025-12-04T14:25:34.3502419Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
2025-12-04T14:25:34.3502437Z
2025-12-04T14:25:34.3502512Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3502860Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3502862Z
2025-12-04T14:25:34.3502949Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3503015Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.3503077Z ======================= 1 failed, 14 deselected in 8.88s =======================
2025-12-04T14:25:34.3503142Z Got exit code 1
2025-12-04T14:25:34.3503422Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3503553Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T14:25:34.3503777Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-16404b3d76e33a54.xml
2025-12-04T14:25:34.3503836Z ============================= test session starts ==============================
2025-12-04T14:25:34.3503951Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3503993Z cachedir: .pytest_cache
2025-12-04T14:25:34.3504153Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3504199Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3504241Z configfile: pytest.ini
2025-12-04T14:25:34.3504403Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3504762Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3504814Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3505159Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3505216Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3505275Z collected 15 items / 3 deselected / 12 selected
2025-12-04T14:25:34.3505327Z stepcurrent: skipping 3 already run items.
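[annotation] The FutureWarning emitted once per rank in the session above (and again in the run below) names its replacement: torch.distributed.checkpoint.state_dict. A minimal sketch of that migration, using a plain nn.Linear as a stand-in for the FSDP-wrapped module in the test; the `model`/`optimizer` names are illustrative, and `cpu_offload=True` is an assumed mapping of the test's offload_to_cpu=True parameter:

    import torch
    from torch.distributed.checkpoint.state_dict import (
        StateDictOptions,
        get_state_dict,
        set_state_dict,
    )

    # Stand-ins; in the test above the module would be FSDP-wrapped.
    model = torch.nn.Linear(4, 4)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    options = StateDictOptions(cpu_offload=True)

    # Instead of FSDP.set_state_dict_type(...) + model.state_dict():
    model_sd, optim_sd = get_state_dict(model, optimizer, options=options)

    # And the load path, replacing the deprecated load under set_state_dict_type:
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )

The same two calls cover FSDP1, FSDP2, and DDP modules, which is why the warning points callers at them; see the API doc and tutorial URLs in the warning text.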
2025-12-04T14:25:34.3505374Z Running 12 items in this shard
2025-12-04T14:25:34.3505375Z
2025-12-04T14:25:34.3505778Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 14:18:46.623000 361798 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 361867
2025-12-04T14:25:34.3505935Z I1204 14:18:46.624000 361798 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 361868
2025-12-04T14:25:34.3506087Z I1204 14:18:46.625000 361798 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 361869
2025-12-04T14:25:34.3506239Z I1204 14:18:46.625000 361798 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 361870
2025-12-04T14:25:34.3506925Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3506990Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3507678Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3507721Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3508389Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3508435Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3509096Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3509140Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3510444Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3510570Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3511823Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3511974Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3513260Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3513382Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3514625Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3514747Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3514878Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3515036Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3515317Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3515466Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3515745Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3515857Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3516127Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3516280Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3516564Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3516702Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3516972Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3517120Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3517391Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3517533Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3518079Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1262485504 and is now 2820669440.
2025-12-04T14:25:34.3518192Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3518380Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3518839Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3518946Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3519149Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3519309Z E1204 14:18:53.912000 361870 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.3519438Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3519592Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3519869Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3520016Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3520327Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3520442Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3520726Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3520881Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3521149Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3521287Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3521583Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3521714Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3521987Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3522126Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3522676Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3522786Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3522974Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3523426Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3523532Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3523735Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3523891Z E1204 14:18:53.939000 361869 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.3524023Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3524171Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3524448Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3524594Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3524872Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3524998Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3525277Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3525419Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3525687Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3525852Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3526119Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3526250Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3526518Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3526656Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3527201Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
2025-12-04T14:25:34.3527308Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3527497Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3527953Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3528059Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3528262Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3528418Z E1204 14:18:53.958000 361867 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.3528549Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3528699Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3528979Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3529121Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3529408Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3529539Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3529806Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3529944Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3530275Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3530417Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3530683Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3530811Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3531079Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3531220Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3531759Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T14:25:34.3531867Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3532054Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3532509Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3532618Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3532818Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3532974Z E1204 14:18:54.014000 361868 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.3533014Z FAILED [8.6194s] [ 8%] 2025-12-04T14:25:34.3533016Z 2025-12-04T14:25:34.3533073Z =================================== FAILURES =================================== 2025-12-04T14:25:34.3533255Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T14:25:34.3533303Z Traceback (most recent call last): 2025-12-04T14:25:34.3533479Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.3533540Z self._join_processes(fn) 2025-12-04T14:25:34.3533712Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.3533765Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.3533943Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.3533986Z raise RuntimeError(error) 2025-12-04T14:25:34.3534067Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.3534112Z Traceback (most recent call last): 2025-12-04T14:25:34.3534295Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3534339Z getattr(self, test_name)() 2025-12-04T14:25:34.3534499Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3534535Z 
fn() 2025-12-04T14:25:34.3534687Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3534729Z method(*args, **kwargs) 2025-12-04T14:25:34.3534880Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3534919Z method(*args, **kwargs) 2025-12-04T14:25:34.3535071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3535107Z with policy(): 2025-12-04T14:25:34.3535259Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3535299Z raise RuntimeError(msg) 2025-12-04T14:25:34.3535726Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1262485504 and is now 2820669440. 2025-12-04T14:25:34.3535729Z 2025-12-04T14:25:34.3535804Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3536141Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3536143Z 2025-12-04T14:25:34.3536233Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3536236Z 2025-12-04T14:25:34.3536237Z 2025-12-04T14:25:34.3536313Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.3536402Z Process 3 terminated with exit code 10, terminating remaining processes. 
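For reference, the leak check that produced the failures above snapshots the caching-allocator and driver-level memory counters before the test body runs and compares them afterwards, which is where the "allocated memory was X and is now Y" numbers come from. A minimal sketch of that comparison, not PyTorch's actual CudaMemoryLeakCheck implementation (run_test_body is a hypothetical stand-in for the wrapped test method):

import torch

def check_for_leak(device: int, run_test_body) -> None:
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)    # caching-allocator bytes
    free_before, total = torch.cuda.mem_get_info(device)  # driver-level view
    driver_before = total - free_before

    run_test_body()

    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after
    # Both counters growing together is what the harness reports as a confirmed leak.
    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"Caching allocator allocated memory was {alloc_before} and is now "
            f"reported as {alloc_after} on device {device}. CUDA driver allocated "
            f"memory was {driver_before} and is now {driver_after}."
        )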
2025-12-04T14:25:34.3536667Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-16404b3d76e33a54.xml -
2025-12-04T14:25:34.3536730Z =========================== short test summary info ============================
2025-12-04T14:25:34.3537076Z FAILED [8.6194s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T14:25:34.3537127Z Traceback (most recent call last):
2025-12-04T14:25:34.3537292Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3537346Z getattr(self, test_name)()
2025-12-04T14:25:34.3537514Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3537552Z fn()
2025-12-04T14:25:34.3537702Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3537744Z method(*args, **kwargs)
2025-12-04T14:25:34.3537893Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3537935Z method(*args, **kwargs)
2025-12-04T14:25:34.3538103Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3538143Z with policy():
2025-12-04T14:25:34.3538292Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3538337Z raise RuntimeError(msg)
2025-12-04T14:25:34.3538764Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1262485504 and is now 2820669440.
2025-12-04T14:25:34.3538766Z
2025-12-04T14:25:34.3538840Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3539173Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3539176Z
2025-12-04T14:25:34.3539261Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3539328Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.3539390Z ======================= 1 failed, 3 deselected in 8.78s ========================
2025-12-04T14:25:34.3539428Z Got exit code 1
2025-12-04T14:25:34.3539469Z Retrying single test...
2025-12-04T14:25:34.3539695Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-85447dc758d6d3f0.xml
2025-12-04T14:25:34.3539751Z ============================= test session starts ==============================
2025-12-04T14:25:34.3539864Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3539905Z cachedir: .pytest_cache
2025-12-04T14:25:34.3540067Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3540111Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3540155Z configfile: pytest.ini
2025-12-04T14:25:34.3540352Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3540712Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3540763Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3541106Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3541165Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3541221Z collected 15 items / 14 deselected / 1 selected
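The PytestCollectionWarning above is benign: pytest tries to collect any class whose name starts with Test, and skips (with this warning) classes that define an __init__. Since these are model helpers rather than test cases, one conventional way to silence the warning is pytest's __test__ attribute; a minimal sketch (the Linear layer is a placeholder, not the real model definition):

import torch

class TestDummyModel(torch.nn.Module):
    # Tell pytest this Test-prefixed class is not a test case, which
    # suppresses the PytestCollectionWarning during collection.
    __test__ = False

    def __init__(self) -> None:
        super().__init__()
        self.net = torch.nn.Linear(8, 8)  # placeholder layer for illustration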
2025-12-04T14:25:34.3541569Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3541626Z Running 1 items in this shard
2025-12-04T14:25:34.3541628Z
2025-12-04T14:25:34.3542030Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 14:18:57.751000 362200 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 362269
2025-12-04T14:25:34.3542211Z I1204 14:18:57.752000 362200 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 362270
2025-12-04T14:25:34.3542365Z I1204 14:18:57.753000 362200 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 362271
2025-12-04T14:25:34.3542513Z I1204 14:18:57.753000 362200 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 362272
2025-12-04T14:25:34.3543194Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3543242Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3543913Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3543961Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3544629Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3544672Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3545343Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3545385Z FSDP.set_state_dict_type(
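The FutureWarning above names its own replacement: the get_state_dict()/set_state_dict() APIs in torch.distributed.checkpoint.state_dict (see the linked API doc). A minimal sketch of the suggested migration, assuming model is an FSDP-wrapped module and optim its optimizer; the function name roundtrip_state_dict is illustrative:

from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

def roundtrip_state_dict(model, optim):
    # Gather a checkpointable view of the (possibly sharded) model/optimizer state
    # instead of going through FSDP.set_state_dict_type.
    model_sd, optim_sd = get_state_dict(model, optim)
    # ... persist model_sd / optim_sd with your checkpoint mechanism here ...
    set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)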
2025-12-04T14:25:34.3546655Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3546810Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3548083Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3548207Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3549451Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3549576Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3550858Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3550979Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
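The AccumulateGrad warning above names its own off-switch, torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False), for cases where the stream mismatch is known to be intentional. A short sketch of using it defensively, guarded with getattr since the toggle only exists in builds that emit this warning:

import torch

# Named in the warning text itself; absent on older builds, hence the guard.
toggle = getattr(torch.autograd.graph,
                 "set_warn_on_accumulate_grad_stream_mismatch", None)
if toggle is not None:
    toggle(False)  # silence the warning when the mismatch is intentional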
2025-12-04T14:25:34.3551127Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3551291Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3551573Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3551717Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3552020Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3552137Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3552405Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3552548Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3552816Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3552957Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3553225Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3553354Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3553625Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3553766Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3554319Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
2025-12-04T14:25:34.3554427Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3554617Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3555068Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3555178Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3555381Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3555548Z E1204 14:19:05.015000 362269 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.3555687Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3555838Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3556117Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3560004Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3560339Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3560461Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3560834Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3560975Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3561246Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3561385Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3561651Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3561779Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3562046Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3562185Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3562737Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3562847Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3563035Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3563495Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3563602Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3563833Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3564008Z E1204 14:19:05.036000 362271 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.3564136Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3564286Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3564622Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3564770Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3565048Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3565160Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3565427Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3565567Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3565839Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3565978Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3566245Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3566371Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3566642Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3566783Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3567327Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3567438Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3567624Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3568080Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3568215Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3568417Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3568575Z E1204 14:19:05.087000 362272 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.3568701Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3568872Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3569148Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3569298Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3569573Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3569689Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3569957Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3570097Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3570394Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3570531Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3570800Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3570927Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3571195Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3571337Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3571878Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3571986Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3572172Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3572637Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3572754Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3572954Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3573132Z E1204 14:19:05.093000 362270 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.3573177Z FAILED [8.5200s] [100%]
2025-12-04T14:25:34.3573180Z
2025-12-04T14:25:34.3573238Z =================================== FAILURES ===================================
2025-12-04T14:25:34.3573420Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _
2025-12-04T14:25:34.3573467Z Traceback (most recent call last):
2025-12-04T14:25:34.3573630Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.3573674Z self._join_processes(fn)
2025-12-04T14:25:34.3573846Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.3573899Z self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.3574078Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.3574123Z raise RuntimeError(error)
2025-12-04T14:25:34.3574203Z RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T14:25:34.3574249Z Traceback (most recent call last):
2025-12-04T14:25:34.3574408Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3574451Z getattr(self, test_name)()
2025-12-04T14:25:34.3574606Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3574641Z fn()
2025-12-04T14:25:34.3574790Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3574833Z method(*args, **kwargs)
2025-12-04T14:25:34.3574982Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3575022Z method(*args, **kwargs)
2025-12-04T14:25:34.3575169Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3575208Z with policy():
2025-12-04T14:25:34.3575358Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3575399Z raise RuntimeError(msg)
2025-12-04T14:25:34.3575824Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
2025-12-04T14:25:34.3575827Z
2025-12-04T14:25:34.3575903Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3576238Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3576261Z
2025-12-04T14:25:34.3576350Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3576352Z
2025-12-04T14:25:34.3576354Z
2025-12-04T14:25:34.3576432Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.3576517Z Process 0 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.3576785Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-85447dc758d6d3f0.xml -
2025-12-04T14:25:34.3576844Z =========================== short test summary info ============================
2025-12-04T14:25:34.3577210Z FAILED [8.5200s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T14:25:34.3577257Z Traceback (most recent call last):
2025-12-04T14:25:34.3577423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3577466Z getattr(self, test_name)()
2025-12-04T14:25:34.3577624Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3577658Z fn()
2025-12-04T14:25:34.3577807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3577846Z method(*args, **kwargs)
2025-12-04T14:25:34.3577997Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3578036Z method(*args, **kwargs)
2025-12-04T14:25:34.3578186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3578222Z with policy():
2025-12-04T14:25:34.3578373Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3578411Z raise RuntimeError(msg)
2025-12-04T14:25:34.3578839Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
2025-12-04T14:25:34.3578842Z
2025-12-04T14:25:34.3578918Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3579252Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3579256Z
2025-12-04T14:25:34.3579342Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3579404Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.3579467Z ======================= 1 failed, 14 deselected in 8.68s =======================
2025-12-04T14:25:34.3579502Z Got exit code 1
2025-12-04T14:25:34.3579544Z Retrying single test...
2025-12-04T14:25:34.3579770Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f55e77dc8e118337.xml
2025-12-04T14:25:34.3579829Z ============================= test session starts ==============================
2025-12-04T14:25:34.3579942Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3579998Z cachedir: .pytest_cache
2025-12-04T14:25:34.3580154Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3580260Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3580300Z configfile: pytest.ini
2025-12-04T14:25:34.3580463Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3580822Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3580903Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3581250Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3581308Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3581365Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.3581690Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3581734Z Running 1 items in this shard
2025-12-04T14:25:34.3581736Z
2025-12-04T14:25:34.3582140Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 14:19:08.890000 362602 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 362671
2025-12-04T14:25:34.3582297Z I1204 14:19:08.891000 362602 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 362672
2025-12-04T14:25:34.3582448Z I1204 14:19:08.891000 362602 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 362673
2025-12-04T14:25:34.3582597Z I1204 14:19:08.892000 362602 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 362674
2025-12-04T14:25:34.3583281Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3583324Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3584004Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3584047Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3584716Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3584785Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3585450Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3585491Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3586782Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3586911Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3588237Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3588361Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3589606Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3589743Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3591073Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T14:25:34.3591193Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2025-12-04T14:25:34.3591325Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3591479Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3591757Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3591905Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3592182Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3592299Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3592567Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3592707Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3592974Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3593112Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3593377Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3593502Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3593770Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3593909Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3594460Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
2025-12-04T14:25:34.3594597Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3594783Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3595256Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3595365Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3595568Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3595722Z E1204 14:19:16.238000 362671 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.3595851Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3596001Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3596277Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3596422Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3596698Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3596811Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3597077Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3597215Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3597480Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3597620Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3597888Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3598013Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3598283Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3598422Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3598993Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3599100Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3599305Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3599757Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3599863Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3600065Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3600255Z E1204 14:19:16.247000 362673 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.3600385Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3600535Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3600812Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3600956Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3601230Z E1204 14:19:16.251000 362672
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3601343Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3601609Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3601747Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3602012Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3602149Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3602415Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3602544Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3602809Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3602979Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3603524Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T14:25:34.3603654Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3603842Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3604292Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3604398Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3604599Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3604760Z E1204 14:19:16.251000 362672 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.3604889Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3605039Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3605316Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3605460Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3605735Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3605847Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3606115Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3606253Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3606519Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3606656Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3606925Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3607064Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3607339Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3607477Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3608036Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 803209216 and is now 2820669440. 2025-12-04T14:25:34.3608144Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3608332Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3608780Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3608886Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3609086Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3609243Z E1204 14:19:16.290000 362674 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.3609283Z FAILED [8.5201s] [100%] 2025-12-04T14:25:34.3609285Z 2025-12-04T14:25:34.3609342Z =================================== FAILURES =================================== 2025-12-04T14:25:34.3609521Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T14:25:34.3609568Z Traceback (most recent call last): 2025-12-04T14:25:34.3609729Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.3609773Z self._join_processes(fn) 2025-12-04T14:25:34.3609945Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.3609998Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.3610292Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.3610337Z raise RuntimeError(error) 2025-12-04T14:25:34.3610416Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.3610461Z Traceback (most recent call last): 2025-12-04T14:25:34.3610682Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3610725Z getattr(self, test_name)() 2025-12-04T14:25:34.3610883Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3610917Z 
fn() 2025-12-04T14:25:34.3611069Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3611109Z method(*args, **kwargs) 2025-12-04T14:25:34.3611282Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3611334Z method(*args, **kwargs) 2025-12-04T14:25:34.3611483Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3611518Z with policy(): 2025-12-04T14:25:34.3611669Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3611709Z raise RuntimeError(msg) 2025-12-04T14:25:34.3612182Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T14:25:34.3612186Z 2025-12-04T14:25:34.3612260Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3612598Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3612601Z 2025-12-04T14:25:34.3612687Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3612689Z 2025-12-04T14:25:34.3612692Z 2025-12-04T14:25:34.3612767Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.3612855Z Process 0 terminated with exit code 10, terminating remaining processes. 
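The repeated UserWarning at the top of this excerpt names its own opt-out switch, torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch. A minimal sketch, assuming the API behaves exactly as the warning text describes; only do this if the stream mismatch is known to be intentional:

import torch

# Per the warning text above: suppress the AccumulateGrad stream-mismatch
# warning when the mismatch is intentional (e.g. DDP stashing a reference
# to the AccumulateGrad node across iterations).
torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)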
2025-12-04T14:25:34.3613128Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f55e77dc8e118337.xml -
2025-12-04T14:25:34.3613191Z =========================== short test summary info ============================
2025-12-04T14:25:34.3613537Z FAILED [8.5201s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
[the worker traceback, leak RuntimeError and repro command from process 0 are repeated here verbatim a third time]
2025-12-04T14:25:34.3615744Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
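The leak RuntimeErrors above come from the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 policy, which snapshots per-device memory before the test body and fails if it has grown afterwards. The sketch below is an illustrative analogue of that check, not the harness's actual implementation; the helper name check_leak is hypothetical.

import gc
import torch

def check_leak(fn, device=0):
    # Settle pending kernels and drop tensors held only by reference cycles,
    # then snapshot the caching allocator's usage on this device.
    torch.cuda.synchronize(device)
    gc.collect()
    before = torch.cuda.memory_allocated(device)
    fn()
    torch.cuda.synchronize(device)
    gc.collect()
    after = torch.cuda.memory_allocated(device)
    # Even a small residual delta counts: the failures above trip on
    # 7680 bytes left allocated after the test.
    if after > before:
        raise RuntimeError(f"possible leak on device {device}: {before} -> {after} bytes")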
2025-12-04T14:25:34.3615805Z ======================= 1 failed, 14 deselected in 8.68s =======================
2025-12-04T14:25:34.3615878Z Got exit code 1
2025-12-04T14:25:34.3616159Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3616289Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T14:25:34.3616514Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-00b0628c2d207d9d.xml
2025-12-04T14:25:34.3616571Z ============================= test session starts ==============================
2025-12-04T14:25:34.3616684Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3616725Z cachedir: .pytest_cache
2025-12-04T14:25:34.3616882Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3616927Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3616966Z configfile: pytest.ini
2025-12-04T14:25:34.3617126Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3617482Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3617533Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3617876Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3617931Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3617988Z collected 15 items / 4 deselected / 11 selected
2025-12-04T14:25:34.3618040Z stepcurrent: skipping 4 already run items.
2025-12-04T14:25:34.3618083Z Running 11 items in this shard
2025-12-04T14:25:34.3618087Z 
2025-12-04T14:25:34.3618491Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 14:19:19.958000 363004 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 363073
2025-12-04T14:25:34.3618646Z I1204 14:19:19.959000 363004 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 363074
2025-12-04T14:25:34.3618795Z I1204 14:19:19.960000 363004 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 363075
2025-12-04T14:25:34.3618945Z I1204 14:19:19.960000 363004 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 363076
2025-12-04T14:25:34.3619617Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3619685Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3620952Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.3621001Z device = _get_pg_default_device(group)
[the FutureWarning and the UserWarning above are each emitted four times, once per rank; the duplicate records are condensed]
2025-12-04T14:25:34.3624163Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3624332Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3624610Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3624758Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3625053Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3625169Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3625440Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3625581Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3625847Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3625985Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3626250Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3626378Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3626645Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3626782Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3627334Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T14:25:34.3627446Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3627634Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3628088Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3628196Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3628396Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3628572Z E1204 14:19:27.771000 363073 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
[processes 3 (pid 363076), 2 (pid 363075) and 1 (pid 363074) log identical tracebacks and repro instructions; only their distinguishing records follow]
2025-12-04T14:25:34.3631870Z E1204 14:19:27.776000 363076 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1254096896 and is now 2843738112.
2025-12-04T14:25:34.3633105Z E1204 14:19:27.776000 363076 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.3636367Z E1204 14:19:27.781000 363075 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3637606Z E1204 14:19:27.781000 363075 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.3640893Z E1204 14:19:27.839000 363074 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3642143Z E1204 14:19:27.839000 363074 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.3642182Z FAILED [9.1211s] [ 9%]
2025-12-04T14:25:34.3642184Z 
2025-12-04T14:25:34.3642240Z =================================== FAILURES ===================================
2025-12-04T14:25:34.3642422Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _
2025-12-04T14:25:34.3642467Z Traceback (most recent call last):
2025-12-04T14:25:34.3642629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.3642672Z self._join_processes(fn)
2025-12-04T14:25:34.3642842Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.3642896Z self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.3643072Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.3643117Z raise RuntimeError(error)
2025-12-04T14:25:34.3643195Z RuntimeError: Process 0 exited with error code 10 and exception:
[the aggregated error repeats the worker tracebacks and repro commands for processes 0, 2 and 3 verbatim; only the leak summaries are kept]
2025-12-04T14:25:34.3644813Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T14:25:34.3645402Z Process 2 exited with error code 10 and exception:
2025-12-04T14:25:34.3647036Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3647595Z Process 3 exited with error code 10 and exception:
2025-12-04T14:25:34.3649219Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1254096896 and is now 2843738112.
2025-12-04T14:25:34.3649823Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.3649910Z Process 0 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.3650213Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-00b0628c2d207d9d.xml -
2025-12-04T14:25:34.3650275Z =========================== short test summary info ============================
2025-12-04T14:25:34.3650616Z FAILED [9.1211s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T14:25:34.3650663Z Traceback (most recent call last):
2025-12-04T14:25:34.3650824Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3650866Z getattr(self, test_name)()
2025-12-04T14:25:34.3651022Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3651057Z fn()
2025-12-04T14:25:34.3651205Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3651244Z method(*args, **kwargs)
2025-12-04T14:25:34.3651390Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3651429Z method(*args, **kwargs)
2025-12-04T14:25:34.3651577Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3651613Z with policy():
2025-12-04T14:25:34.3651761Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3651803Z raise RuntimeError(msg)
2025-12-04T14:25:34.3652230Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T14:25:34.3652232Z 
2025-12-04T14:25:34.3652306Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3652639Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3652641Z 
2025-12-04T14:25:34.3652726Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3652743Z 
2025-12-04T14:25:34.3652801Z Process 2 exited with error code 10 and exception:
2025-12-04T14:25:34.3652862Z Traceback (most recent call last):
2025-12-04T14:25:34.3653021Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3653061Z getattr(self, test_name)()
2025-12-04T14:25:34.3653216Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3653249Z fn()
2025-12-04T14:25:34.3653396Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3653434Z method(*args, **kwargs)
2025-12-04T14:25:34.3653610Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3653648Z method(*args, **kwargs)
2025-12-04T14:25:34.3653799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3653834Z with policy():
2025-12-04T14:25:34.3653982Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3654021Z raise RuntimeError(msg)
2025-12-04T14:25:34.3654446Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3654448Z 
2025-12-04T14:25:34.3654521Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3654854Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3654858Z 
2025-12-04T14:25:34.3654942Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3654944Z 
2025-12-04T14:25:34.3654999Z Process 3 exited with error code 10 and exception:
2025-12-04T14:25:34.3655043Z Traceback (most recent call last):
2025-12-04T14:25:34.3655201Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3655242Z getattr(self, test_name)()
2025-12-04T14:25:34.3655398Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3655434Z fn()
2025-12-04T14:25:34.3655581Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3655621Z method(*args, **kwargs)
2025-12-04T14:25:34.3655767Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3655805Z method(*args, **kwargs)
2025-12-04T14:25:34.3655952Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3655987Z with policy():
2025-12-04T14:25:34.3656134Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3656174Z raise RuntimeError(msg)
2025-12-04T14:25:34.3656598Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1254096896 and is now 2843738112.
2025-12-04T14:25:34.3656610Z 
2025-12-04T14:25:34.3656683Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3657031Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3657033Z 
2025-12-04T14:25:34.3657116Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3657180Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.3657241Z ======================= 1 failed, 4 deselected in 9.28s ========================
2025-12-04T14:25:34.3657277Z Got exit code 1
2025-12-04T14:25:34.3657335Z Retrying single test...
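
The failure above comes from the leak-check policy that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables (the `with policy():` frame in torch/testing/_internal/common_utils.py): the harness records caching-allocator and driver-level memory usage before the test body and compares again afterwards. Here every rank reports the caching allocator growing from 0 to 2560 bytes while driver-allocated memory grows by roughly 1.5 GB (for example 1421869056 -> 2996830208 bytes on device 0), so the check fails on all ranks and each worker exits with code 10. Below is a minimal sketch of that comparison using only public torch.cuda APIs; the helper name assert_no_cuda_leak is hypothetical and this is not PyTorch's actual implementation.

# Illustrative sketch of the before/after comparison the leak checker performs.
# Assumption: `assert_no_cuda_leak` is a made-up name; PyTorch's real check
# lives in torch/testing/_internal/common_utils.py.
import torch

def assert_no_cuda_leak(fn, device=0):
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)  # caching-allocator bytes
    free, total = torch.cuda.mem_get_info(device)
    driver_before = total - free                        # driver-reported usage
    fn()                                                # run the test body
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free, total = torch.cuda.mem_get_info(device)
    driver_after = total - free
    # Flag a leak only when the driver confirms what the caching allocator
    # reports, which is the wording the RuntimeError above uses.
    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible CUDA leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after} bytes, driver "
            f"{driver_before} -> {driver_after} bytes"
        )

On a ROCm build, torch.cuda maps onto HIP, which is why the repro command sets PYTORCH_TEST_WITH_ROCM=1 while the error messages still speak of CUDA.
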
2025-12-04T14:25:34.3657559Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-3b758ef16287a8c7.xml
2025-12-04T14:25:34.3657617Z ============================= test session starts ==============================
2025-12-04T14:25:34.3657730Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3657769Z cachedir: .pytest_cache
2025-12-04T14:25:34.3657927Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3657971Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3658011Z configfile: pytest.ini
2025-12-04T14:25:34.3658174Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3658533Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3658582Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3658926Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3658981Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3659037Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.3659364Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3659406Z Running 1 items in this shard
2025-12-04T14:25:34.3659408Z 
2025-12-04T14:25:34.3659810Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 14:19:31.503000 363406 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 363475
2025-12-04T14:25:34.3659964Z I1204 14:19:31.504000 363406 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 363476
2025-12-04T14:25:34.3660113Z I1204 14:19:31.505000 363406 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 363477
2025-12-04T14:25:34.3660317Z I1204 14:19:31.505000 363406 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 363478
2025-12-04T14:25:34.3660999Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict . Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3661076Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3661741Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict . Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3661806Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3662464Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict . Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3662506Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3663165Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict . Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3663205Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3663698Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated; it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.3663746Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.3664231Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated; it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.3664278Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.3664759Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated; it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.3664806Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.3665288Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated; it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.3665333Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.3665464Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3665641Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3665919Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3666063Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3666357Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3666471Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3666740Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3666879Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3667144Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3667280Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3667546Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3667673Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3667944Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3668082Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3668631Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2843738112.
2025-12-04T14:25:34.3668739Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3668927Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3669381Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3669488Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3669690Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3669858Z E1204 14:19:39.290000 363478 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.3669996Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3670146Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3670470Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3670641Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3670915Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3671030Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3671297Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3671436Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3671701Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3671836Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3672102Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3672228Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3672495Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3672634Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3673180Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3673288Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3673474Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3673929Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3674033Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3674248Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3674417Z E1204 14:19:39.308000 363476 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.3674547Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3674695Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3675003Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3675148Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3675423Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3675535Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3675800Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3675939Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3676203Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3676343Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3676607Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3676733Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3677001Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3677140Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3677693Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T14:25:34.3677801Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3677987Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3678441Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3678566Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3678767Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3678921Z E1204 14:19:39.347000 363475 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.3679048Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3679218Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3679493Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3679637Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3679913Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3680024Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3680334Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3680472Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3680740Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3680879Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3681144Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3681271Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3681537Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3681675Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3682222Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1256194048 and is now 2843738112.
2025-12-04T14:25:34.3682328Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3682518Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3682982Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3683100Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3683300Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3683485Z E1204 14:19:39.365000 363477 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.3683525Z FAILED [9.1211s] [100%]
2025-12-04T14:25:34.3683527Z 
2025-12-04T14:25:34.3683584Z =================================== FAILURES ===================================
2025-12-04T14:25:34.3683769Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _
2025-12-04T14:25:34.3683819Z Traceback (most recent call last):
2025-12-04T14:25:34.3683979Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.3684026Z self._join_processes(fn)
2025-12-04T14:25:34.3684199Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.3684250Z self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.3684429Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.3684472Z raise RuntimeError(error)
2025-12-04T14:25:34.3684550Z RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T14:25:34.3684594Z Traceback (most recent call last):
2025-12-04T14:25:34.3684756Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3684797Z getattr(self, test_name)()
2025-12-04T14:25:34.3684955Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3684990Z fn()
2025-12-04T14:25:34.3685140Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3685179Z method(*args, **kwargs)
2025-12-04T14:25:34.3685330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3685369Z method(*args, **kwargs)
2025-12-04T14:25:34.3685522Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3685558Z with policy():
2025-12-04T14:25:34.3685711Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3685750Z raise RuntimeError(msg)
2025-12-04T14:25:34.3686180Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2843738112.
2025-12-04T14:25:34.3686182Z 
2025-12-04T14:25:34.3686257Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3686593Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3686618Z 
2025-12-04T14:25:34.3686706Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3686708Z 
2025-12-04T14:25:34.3686710Z 
2025-12-04T14:25:34.3686785Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.3686874Z Process 3 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.3687141Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-3b758ef16287a8c7.xml -
2025-12-04T14:25:34.3687203Z =========================== short test summary info ============================
2025-12-04T14:25:34.3687571Z FAILED [9.1211s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T14:25:34.3687622Z Traceback (most recent call last):
2025-12-04T14:25:34.3687783Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3687826Z getattr(self, test_name)()
2025-12-04T14:25:34.3687983Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3688017Z fn()
2025-12-04T14:25:34.3688164Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3688203Z method(*args, **kwargs)
2025-12-04T14:25:34.3688352Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3688392Z method(*args, **kwargs)
2025-12-04T14:25:34.3688539Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3688578Z with policy():
2025-12-04T14:25:34.3688726Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3688767Z raise RuntimeError(msg)
2025-12-04T14:25:34.3689196Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2843738112.
2025-12-04T14:25:34.3689198Z 
2025-12-04T14:25:34.3689272Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3689606Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3689610Z 
2025-12-04T14:25:34.3689695Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3689758Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.3689819Z ======================= 1 failed, 14 deselected in 9.29s =======================
2025-12-04T14:25:34.3689855Z Got exit code 1
2025-12-04T14:25:34.3689893Z Retrying single test...
2025-12-04T14:25:34.3690117Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-8bdcc308a58056bb.xml
2025-12-04T14:25:34.3690213Z ============================= test session starts ==============================
2025-12-04T14:25:34.3690326Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3690382Z cachedir: .pytest_cache
2025-12-04T14:25:34.3690538Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3690596Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3690637Z configfile: pytest.ini
2025-12-04T14:25:34.3690799Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3691156Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3691238Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3691579Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3691638Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3691692Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.3692016Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3692059Z Running 1 items in this shard
2025-12-04T14:25:34.3692061Z 
2025-12-04T14:25:34.3692471Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 14:19:43.176000 363808 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 363877
2025-12-04T14:25:34.3692627Z I1204 14:19:43.177000 363808 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 363878
2025-12-04T14:25:34.3692781Z I1204 14:19:43.177000 363808 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 363879
2025-12-04T14:25:34.3692930Z I1204 14:19:43.178000 363808 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 363880
2025-12-04T14:25:34.3693610Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict . Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3693655Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3694323Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict . Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3694366Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3695034Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict . Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3695096Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3695766Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict . Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3695807Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3696320Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated; it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.3696372Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.3696859Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated; it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.3696907Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.3697390Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated; it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.3697439Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.3697923Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated; it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
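
The four FutureWarnings above (one per rank) flag the FSDP.set_state_dict_type() call at test_fsdp_dtensor_state_dict.py:189 and point at the torch.distributed.checkpoint.state_dict APIs instead; the paired UserWarnings concern `_get_pg_default_device`, a private helper whose suggested replacements (`_get_object_coll_device`, `_device_capability`) are likewise private, so they are informational here. Below is a hedged sketch of the suggested migration; it assumes an already-initialized process group, an FSDP-wrapped `model`, and an optimizer `optim`, all of which are placeholders rather than objects from this log.

# Hedged sketch of the migration the FutureWarning recommends, not the
# test's actual code. `model` and `optim` are assumed placeholders.
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def roundtrip_state_dict(model, optim):
    # Instead of FSDP.set_state_dict_type(...), request the layout directly;
    # full_state_dict=False keeps parameters as sharded DTensors.
    options = StateDictOptions(full_state_dict=False, cpu_offload=False)
    model_sd, optim_sd = get_state_dict(model, optim, options=options)
    # ... persist model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...
    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )

The sharded (full_state_dict=False) layout is the DTensor state dict this test exercises; unlike the deprecated context-manager style, these functions work across FSDP1, FSDP2, and DDP, per the warning text.
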
2025-12-04T14:25:34.3697971Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.3698104Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3698259Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3698542Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3698688Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3698967Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3699082Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3699356Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3699505Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3699783Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3699922Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3700225Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3700383Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3700652Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3700794Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3701348Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3701458Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3701645Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3702099Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3702209Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3702409Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3702569Z E1204 14:19:50.944000 363878 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.3702697Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3702847Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3703124Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3703268Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3703542Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3703657Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3703925Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3704090Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3704357Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3704493Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3704781Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3704909Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3705179Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3705317Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3705869Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2843738112.
2025-12-04T14:25:34.3705978Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3706167Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3706620Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3706724Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3706928Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3707085Z E1204 14:19:50.998000 363880 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.3707215Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3707366Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3707641Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3707785Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3708059Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3708185Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3708469Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3708609Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3708872Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3709030Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3709296Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3709423Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3709690Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3709827Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3710411Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T14:25:34.3710519Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3710708Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3711163Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3711268Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3711471Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3711625Z E1204 14:19:50.998000 363877 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.3711752Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3711901Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3712179Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3712323Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3712612Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3712737Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3713002Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3713139Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3713428Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3713567Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3713833Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3713961Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3714230Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3714371Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3714923Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3715031Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3715220Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3715676Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3715782Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3715982Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3716140Z E1204 14:19:51.010000 363879 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.3716179Z FAILED [8.9193s] [100%] 2025-12-04T14:25:34.3716181Z 2025-12-04T14:25:34.3716234Z =================================== FAILURES =================================== 2025-12-04T14:25:34.3716421Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T14:25:34.3716466Z Traceback (most recent call last): 2025-12-04T14:25:34.3716640Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.3716693Z self._join_processes(fn) 2025-12-04T14:25:34.3716866Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.3716917Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.3717095Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.3717138Z raise RuntimeError(error) 2025-12-04T14:25:34.3717218Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T14:25:34.3717262Z Traceback (most recent call last): 2025-12-04T14:25:34.3717448Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3717490Z getattr(self, test_name)() 2025-12-04T14:25:34.3717649Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3717683Z fn() 2025-12-04T14:25:34.3717835Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3717877Z method(*args, **kwargs) 2025-12-04T14:25:34.3718027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3718066Z method(*args, **kwargs) 2025-12-04T14:25:34.3718215Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3718250Z with policy(): 2025-12-04T14:25:34.3718405Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3718447Z raise RuntimeError(msg) 2025-12-04T14:25:34.3718880Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T14:25:34.3718884Z 2025-12-04T14:25:34.3718959Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3719294Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3719297Z 2025-12-04T14:25:34.3719384Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3719386Z 2025-12-04T14:25:34.3719388Z 2025-12-04T14:25:34.3719462Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.3719550Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.3719817Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-8bdcc308a58056bb.xml - 2025-12-04T14:25:34.3719876Z =========================== short test summary info ============================ 2025-12-04T14:25:34.3720253Z FAILED [8.9193s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T14:25:34.3720300Z Traceback (most recent call last): 2025-12-04T14:25:34.3720463Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3720521Z getattr(self, test_name)() 2025-12-04T14:25:34.3720694Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3720727Z fn() 2025-12-04T14:25:34.3720876Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3720914Z method(*args, **kwargs) 2025-12-04T14:25:34.3721064Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3721102Z method(*args, **kwargs) 2025-12-04T14:25:34.3721276Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3721313Z with policy(): 2025-12-04T14:25:34.3721462Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3721503Z raise RuntimeError(msg) 2025-12-04T14:25:34.3721932Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T14:25:34.3721935Z 2025-12-04T14:25:34.3722007Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3722342Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3722344Z 2025-12-04T14:25:34.3722428Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3722492Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T14:25:34.3722553Z ======================= 1 failed, 14 deselected in 9.06s ======================= 2025-12-04T14:25:34.3722592Z Got exit code 1 2025-12-04T14:25:34.3722875Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3723000Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T14:25:34.3723224Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-217f048b1aad6fb7.xml 2025-12-04T14:25:34.3723283Z ============================= test session starts ============================== 2025-12-04T14:25:34.3723395Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.3723436Z cachedir: .pytest_cache 2025-12-04T14:25:34.3723594Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.3723637Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.3723678Z configfile: pytest.ini 2025-12-04T14:25:34.3723838Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.3724199Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.3724247Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.3724594Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.3724676Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.3724730Z collected 15 items / 5 deselected / 10 selected 2025-12-04T14:25:34.3724781Z stepcurrent: skipping 5 already run items. 
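[Editor's note] For context on the session header above: the 'pytorch_ci' hypothesis profile it reports (database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]) is the kind of profile registered through hypothesis's public settings API, typically in a conftest.py (where exactly PyTorch registers it is an assumption here; the values are taken verbatim from the log). A minimal sketch:

from hypothesis import HealthCheck, settings

# Register a profile matching the values printed in the session header.
settings.register_profile(
    "pytorch_ci",
    database=None,                                 # no example database in CI
    max_examples=50,                               # cap generated examples per test
    derandomize=True,                              # deterministic example generation
    suppress_health_check=[HealthCheck.too_slow],  # tolerate slow test bodies
)
settings.load_profile("pytorch_ci")

The PytestCollectionWarning above is likewise benign: pytest refuses to collect Test-prefixed classes that define __init__, and TestDummyModel / TestDummyModelUneven are nn.Module helpers rather than test classes; setting the class attribute __test__ = False on such helpers is the usual way to silence the warning.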
2025-12-04T14:25:34.3724824Z Running 10 items in this shard 2025-12-04T14:25:34.3724826Z 2025-12-04T14:25:34.3725227Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 14:19:54.902000 364210 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 364279 2025-12-04T14:25:34.3725400Z I1204 14:19:54.903000 364210 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 364280 2025-12-04T14:25:34.3725553Z I1204 14:19:54.903000 364210 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 364281 2025-12-04T14:25:34.3725702Z I1204 14:19:54.904000 364210 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 364282 2025-12-04T14:25:34.3726379Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3726421Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3726920Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.3726968Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.3727639Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3727682Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3728169Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.3728216Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.3728884Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3728925Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3729590Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3729650Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3730160Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.3730236Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.3730719Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.3730765Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.3730897Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3731050Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3731328Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3731475Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3731753Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3731867Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3732135Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3732275Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3732543Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T14:25:34.3732681Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3732947Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3733075Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3733348Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3733486Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3734062Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T14:25:34.3734171Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3734384Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3734838Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3734945Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3735146Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3735301Z E1204 14:20:02.678000 364282 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.3735431Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3735581Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3735859Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3736005Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3736279Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3736394Z E1204 
14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3736664Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3736807Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3737075Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3737213Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3737478Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3737606Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3737875Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3738034Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3738578Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T14:25:34.3738712Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3738900Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3739355Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3739459Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3739660Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3739814Z E1204 14:20:02.691000 364279 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.3739941Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3740092Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3740400Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3740543Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3740821Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3740933Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3741203Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3741341Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3741606Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3741744Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3742011Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3742153Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3742432Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3742571Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3743140Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T14:25:34.3743247Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3743434Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3743882Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3743986Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3744186Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3744340Z E1204 14:20:02.695000 364280 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.3744467Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3744616Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3744897Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3745042Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3745318Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3745431Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3745700Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3745837Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3746109Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3746244Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3746525Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3746666Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3746934Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3747076Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3747642Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T14:25:34.3747752Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3747939Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3748393Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3748500Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3748701Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3748859Z E1204 14:20:02.697000 364281 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.3748897Z FAILED [9.0192s] [ 10%] 2025-12-04T14:25:34.3748899Z 2025-12-04T14:25:34.3748958Z =================================== FAILURES =================================== 2025-12-04T14:25:34.3749138Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T14:25:34.3749186Z Traceback (most recent call last): 2025-12-04T14:25:34.3749348Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.3749395Z self._join_processes(fn) 2025-12-04T14:25:34.3749569Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.3749623Z self._check_return_codes(fn, elapsed_time) 
2025-12-04T14:25:34.3749798Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.3749845Z raise RuntimeError(error) 2025-12-04T14:25:34.3749923Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.3749971Z Traceback (most recent call last): 2025-12-04T14:25:34.3750129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3750213Z getattr(self, test_name)() 2025-12-04T14:25:34.3750371Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3750427Z fn() 2025-12-04T14:25:34.3750576Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3750632Z method(*args, **kwargs) 2025-12-04T14:25:34.3750782Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3750821Z method(*args, **kwargs) 2025-12-04T14:25:34.3750970Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3751006Z with policy(): 2025-12-04T14:25:34.3751158Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3751227Z raise RuntimeError(msg) 2025-12-04T14:25:34.3751656Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T14:25:34.3751663Z 2025-12-04T14:25:34.3751736Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3752072Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3752074Z 2025-12-04T14:25:34.3752159Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3752161Z 2025-12-04T14:25:34.3752224Z Process 2 exited with error code 10 and exception: 2025-12-04T14:25:34.3752269Z Traceback (most recent call last): 2025-12-04T14:25:34.3752430Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3752471Z getattr(self, test_name)() 2025-12-04T14:25:34.3752632Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3752665Z fn() 2025-12-04T14:25:34.3752815Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3752856Z method(*args, **kwargs) 2025-12-04T14:25:34.3753006Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3753047Z method(*args, **kwargs) 2025-12-04T14:25:34.3753198Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3753237Z with policy(): 2025-12-04T14:25:34.3753386Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3753433Z raise RuntimeError(msg) 2025-12-04T14:25:34.3753863Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T14:25:34.3753865Z 2025-12-04T14:25:34.3753939Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3754276Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3754278Z 2025-12-04T14:25:34.3754364Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3754377Z 2025-12-04T14:25:34.3754435Z Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.3754491Z Traceback (most recent call last): 2025-12-04T14:25:34.3754650Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3754694Z getattr(self, test_name)() 2025-12-04T14:25:34.3754851Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3754885Z fn() 2025-12-04T14:25:34.3755037Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3755075Z method(*args, **kwargs) 2025-12-04T14:25:34.3755250Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3755288Z method(*args, **kwargs) 2025-12-04T14:25:34.3755441Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3755477Z with policy(): 2025-12-04T14:25:34.3755629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3755669Z raise RuntimeError(msg) 2025-12-04T14:25:34.3756097Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T14:25:34.3756099Z 2025-12-04T14:25:34.3756173Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3756504Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3756508Z 2025-12-04T14:25:34.3756592Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3756594Z 2025-12-04T14:25:34.3756598Z 2025-12-04T14:25:34.3756673Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.3756762Z Process 0 terminated with exit code 10, terminating remaining processes. 
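[Editor's note] "Process 0 terminated with exit code 10, terminating remaining processes." reflects the spawn-based harness visible in the tracebacks: each rank runs the test in its own process and exits nonzero on failure, and the parent joins the workers (_join_processes) and turns any bad exit code into the RuntimeError shown above (_check_return_codes). A minimal sketch of that pattern, assuming exit code 10 denotes a test failure as in this log (the helper names below are illustrative, not PyTorch's implementation):

import multiprocessing as mp
import sys

FAIL_EXIT_CODE = 10  # matches the "exit code: 10" lines in this log

def _worker(rank: int) -> None:
    # Stand-in for run_test(): a real worker executes the test body and
    # exits nonzero when it raises.
    sys.exit(FAIL_EXIT_CODE if rank == 0 else 0)

def run_and_check(world_size: int = 4) -> None:
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=_worker, args=(r,)) for r in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Mirrors _check_return_codes: surface the first bad exit code.
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    run_and_check()  # raises: Process 0 exited with error code 10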
2025-12-04T14:25:34.3757029Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-217f048b1aad6fb7.xml - 2025-12-04T14:25:34.3757093Z =========================== short test summary info ============================ 2025-12-04T14:25:34.3757435Z FAILED [9.0192s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.3757485Z Traceback (most recent call last): 2025-12-04T14:25:34.3757646Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3757690Z getattr(self, test_name)() 2025-12-04T14:25:34.3757850Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3757886Z fn() 2025-12-04T14:25:34.3758036Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3758078Z method(*args, **kwargs) 2025-12-04T14:25:34.3758228Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3758282Z method(*args, **kwargs) 2025-12-04T14:25:34.3758441Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3758480Z with policy(): 2025-12-04T14:25:34.3758630Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3758673Z raise RuntimeError(msg) 2025-12-04T14:25:34.3759121Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T14:25:34.3759126Z 2025-12-04T14:25:34.3759199Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3759538Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3759541Z 2025-12-04T14:25:34.3759626Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3759628Z 2025-12-04T14:25:34.3759687Z Process 2 exited with error code 10 and exception: 2025-12-04T14:25:34.3759733Z Traceback (most recent call last): 2025-12-04T14:25:34.3759894Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3759936Z getattr(self, test_name)() 2025-12-04T14:25:34.3760098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3760133Z fn() 2025-12-04T14:25:34.3760321Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3760364Z method(*args, **kwargs) 2025-12-04T14:25:34.3760513Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3760552Z method(*args, **kwargs) 2025-12-04T14:25:34.3760701Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3760738Z with policy(): 2025-12-04T14:25:34.3760888Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3760929Z raise RuntimeError(msg) 2025-12-04T14:25:34.3761356Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T14:25:34.3761360Z 2025-12-04T14:25:34.3761434Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3761764Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3761766Z 2025-12-04T14:25:34.3761852Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3761854Z 2025-12-04T14:25:34.3761911Z Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.3761960Z Traceback (most recent call last): 2025-12-04T14:25:34.3762121Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3762183Z getattr(self, test_name)() 2025-12-04T14:25:34.3762340Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3762391Z fn() 2025-12-04T14:25:34.3762538Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3762579Z method(*args, **kwargs) 2025-12-04T14:25:34.3762726Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3762767Z method(*args, **kwargs) 2025-12-04T14:25:34.3762915Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3762982Z with policy(): 2025-12-04T14:25:34.3763133Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3763178Z raise RuntimeError(msg) 2025-12-04T14:25:34.3763604Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T14:25:34.3763609Z 2025-12-04T14:25:34.3763681Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3764017Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3764020Z 2025-12-04T14:25:34.3764104Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3764171Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T14:25:34.3764235Z ======================= 1 failed, 5 deselected in 9.18s ======================== 2025-12-04T14:25:34.3764274Z Got exit code 1 2025-12-04T14:25:34.3764314Z Retrying single test... 
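[Editor's note] Before the retry output below: the failing test exercises FSDP.set_state_dict_type, which the FutureWarning in its output flags as deprecated in favor of get_state_dict() / set_state_dict() from torch.distributed.checkpoint.state_dict (the warning links the API doc and tutorial). A hedged sketch of that migration; the function and its save step are illustrative, and it assumes a distributed process group is already initialized when the model is FSDP-wrapped:

import torch
from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

def checkpoint_roundtrip(model: torch.nn.Module, optim: torch.optim.Optimizer) -> None:
    # Replacement for the deprecated FSDP.set_state_dict_type(...) pattern;
    # per the warning, these helpers support FSDP1, FSDP2, and DDP alike.
    model_sd, optim_sd = get_state_dict(model, optim)
    # ... persist model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...
    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
    )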
2025-12-04T14:25:34.3764542Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-3f8d05faf77fe3db.xml 2025-12-04T14:25:34.3764598Z ============================= test session starts ============================== 2025-12-04T14:25:34.3764713Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.3764755Z cachedir: .pytest_cache 2025-12-04T14:25:34.3764915Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.3764959Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.3765002Z configfile: pytest.ini 2025-12-04T14:25:34.3765162Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.3765520Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.3765570Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.3765913Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.3765971Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.3766026Z collected 15 items / 14 deselected / 1 selected 2025-12-04T14:25:34.3766363Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3766417Z Running 1 items in this shard 2025-12-04T14:25:34.3766419Z 2025-12-04T14:25:34.3766825Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 14:20:06.332000 364612 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 364681 2025-12-04T14:25:34.3766978Z I1204 14:20:06.333000 364612 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 364682 2025-12-04T14:25:34.3767148Z I1204 14:20:06.333000 364612 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 364683 2025-12-04T14:25:34.3767297Z I1204 14:20:06.334000 364612 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 364684 2025-12-04T14:25:34.3767976Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T14:25:34.3768020Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3768683Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3768729Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3769388Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3769433Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3770095Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3770137Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3770668Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.3770717Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.3771203Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.3771286Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.3771770Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.3771841Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.3772325Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.3772373Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.3772507Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3772664Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3772948Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3773095Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3773375Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3773491Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3773762Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3773902Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3774173Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3774314Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3774588Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3774717Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3774991Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3775134Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3775687Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T14:25:34.3776008Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3776484Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3776797Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3776954Z E1204 14:20:14.172000 364681 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
[ranks 2, 3 and 1 fail the same check with identical tracebacks and repro instructions, and each exits with code 10; only the pid, device and driver figures differ:]
2025-12-04T14:25:34.3780256Z E1204 14:20:14.188000 364683 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3784787Z E1204 14:20:14.225000 364684 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 803209216 and is now 2843738112.
2025-12-04T14:25:34.3789297Z E1204 14:20:14.247000 364682 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112.
2025-12-04T14:25:34.3790592Z FAILED [9.0205s] [100%]
2025-12-04T14:25:34.3790651Z =================================== FAILURES ===================================
2025-12-04T14:25:34.3790832Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _
2025-12-04T14:25:34.3790878Z Traceback (most recent call last):
2025-12-04T14:25:34.3791042Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.3791087Z     self._join_processes(fn)
2025-12-04T14:25:34.3791262Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.3791315Z     self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.3791493Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.3791535Z     raise RuntimeError(error)
2025-12-04T14:25:34.3791616Z RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T14:25:34.3791660Z Traceback (most recent call last):
2025-12-04T14:25:34.3791822Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3791862Z     getattr(self, test_name)()
2025-12-04T14:25:34.3792021Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3792054Z     fn()
2025-12-04T14:25:34.3792204Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3792262Z     method(*args, **kwargs)
2025-12-04T14:25:34.3792426Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3792466Z     method(*args, **kwargs)
2025-12-04T14:25:34.3792619Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3792655Z     with policy():
2025-12-04T14:25:34.3792807Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3792847Z     raise RuntimeError(msg)
2025-12-04T14:25:34.3793306Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T14:25:34.3793386Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3793722Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3793813Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3793891Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.3793981Z Process 0 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.3794249Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-3f8d05faf77fe3db.xml -
2025-12-04T14:25:34.3794311Z =========================== short test summary info ============================
2025-12-04T14:25:34.3794654Z FAILED [9.0205s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
[the summary entry repeats the traceback and repro instructions shown under FAILURES verbatim]
2025-12-04T14:25:34.3796869Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.3796951Z ======================= 1 failed, 14 deselected in 9.18s =======================
2025-12-04T14:25:34.3796989Z Got exit code 1
2025-12-04T14:25:34.3797029Z Retrying single test...
2025-12-04T14:25:34.3797256Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-33383146472fd2fd.xml
2025-12-04T14:25:34.3797313Z ============================= test session starts ==============================
2025-12-04T14:25:34.3797426Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3797467Z cachedir: .pytest_cache
2025-12-04T14:25:34.3797624Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3797670Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3797711Z configfile: pytest.ini
2025-12-04T14:25:34.3797873Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3798228Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3798279Z   class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3798619Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3798677Z   class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3798731Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.3799153Z stepcurrent: skipping 5 already run items.
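The PytestCollectionWarning above fires because the helper models follow pytest's `Test*` naming convention while defining `__init__`, so pytest refuses to collect them. One way to silence the warning, a sketch rather than what the file under test currently does, is to opt the class out of collection explicitly:

import torch

class TestDummyModel(torch.nn.Module):
    __test__ = False  # tell pytest this is a model, not a test class

    def __init__(self) -> None:
        super().__init__()
        self.net = torch.nn.Linear(8, 8)  # illustrative layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

Renaming the helpers so they do not start with `Test` would work equally well.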
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3799234Z Running 1 items in this shard
2025-12-04T14:25:34.3799657Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 14:20:18.031000 365014 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 365083
2025-12-04T14:25:34.3799839Z I1204 14:20:18.032000 365014 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 365084
2025-12-04T14:25:34.3800009Z I1204 14:20:18.032000 365014 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 365085
2025-12-04T14:25:34.3800226Z I1204 14:20:18.033000 365014 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 365086
[the retry emits the same FutureWarning and UserWarning set as the first run, then reproduces the leak on every rank]
2025-12-04T14:25:34.3811152Z E1204 14:20:25.895000 365083 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
[likewise on device 2 (pid 365085, driver 1268776960 -> 2843738112), device 3 (pid 365086, driver 1264582656 -> 2843738112) and device 1 (pid 365084, driver 1268776960 -> 2843738112); tracebacks and repro instructions are identical, and all four workers exit with code 10]
2025-12-04T14:25:34.3825993Z FAILED [9.2208s] [100%]
2025-12-04T14:25:34.3826053Z =================================== FAILURES ===================================
2025-12-04T14:25:34.3826232Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _
[the FAILURES block and the short test summary repeat the Process 0 traceback from the first run, followed by the matching Process 2 traceback (leak on device 2, driver 1268776960 -> 2843738112)]
2025-12-04T14:25:34.3831560Z Process 0 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.3831828Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-33383146472fd2fd.xml -
2025-12-04T14:25:34.3836643Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
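The sequence "Got exit code 1" -> "Retrying single test..." -> "FAILED CONSISTENTLY" traces a simple rerun policy: a failure is only reported as consistent once the isolated retry fails too. An illustrative sketch of that flow, not the runner's actual code:

def classify_failure(run_once, retries: int = 1) -> str:
    # First pass over the shard; exit code 0 means the test passed.
    if run_once() == 0:
        return "passed"
    # "Retrying single test...": rerun the failing test in isolation.
    for _ in range(retries):
        if run_once() == 0:
            return "flaky"            # passed on retry
    return "failed consistently"      # reported, then continue-through-error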
2025-12-04T14:25:34.3836720Z ======================= 1 failed, 14 deselected in 9.38s =======================
2025-12-04T14:25:34.3836768Z Got exit code 1
2025-12-04T14:25:34.3837047Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3837173Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T14:25:34.3837396Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-22d0aabb4c53013f.xml
2025-12-04T14:25:34.3837476Z ============================= test session starts ==============================
2025-12-04T14:25:34.3837589Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3837630Z cachedir: .pytest_cache
2025-12-04T14:25:34.3837788Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3837833Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3837873Z configfile: pytest.ini
2025-12-04T14:25:34.3838034Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3838392Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3838440Z   class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3838782Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3838838Z   class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3838891Z collected 15 items / 6 deselected / 9 selected
2025-12-04T14:25:34.3838941Z stepcurrent: skipping 6 already run items.
2025-12-04T14:25:34.3838983Z Running 9 items in this shard
2025-12-04T14:25:34.3838986Z 
2025-12-04T14:25:34.3839389Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 14:20:29.750000 365416 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 365485
2025-12-04T14:25:34.3839547Z I1204 14:20:29.751000 365416 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 365486
2025-12-04T14:25:34.3839697Z I1204 14:20:29.752000 365416 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 365487
2025-12-04T14:25:34.3839846Z I1204 14:20:29.752000 365416 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 365488
2025-12-04T14:25:34.3840570Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use the APIs get_state_dict() and set_state_dict(), which can support different parallelisms: FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict . Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3840612Z   FSDP.set_state_dict_type(
2025-12-04T14:25:34.3843292Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.3843340Z   device = _get_pg_default_device(group)
[each of the four ranks printed the same FutureWarning and UserWarning pair; the three repeats of each are omitted here]
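[Annotation: the FutureWarning above asks callers to move from FSDP.set_state_dict_type() to the torch.distributed.checkpoint.state_dict helpers. A hedged sketch of that migration follows, assuming an initialized process group and an FSDP-wrapped model with its optimizer; the function name checkpoint_roundtrip is illustrative, not from the PyTorch source.]

import torch
from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

def checkpoint_roundtrip(model: torch.nn.Module, optim: torch.optim.Optimizer) -> None:
    # get_state_dict works uniformly across FSDP1, FSDP2 and DDP wrappers,
    # which is what the deprecation message recommends over
    # FSDP.set_state_dict_type().
    model_sd, optim_sd = get_state_dict(model, optim)
    # ... persist model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...
    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
    )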
2025-12-04T14:25:34.3845065Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3845217Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3845498Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3845644Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
2025-12-04T14:25:34.3845947Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3846062Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
2025-12-04T14:25:34.3846328Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3846488Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.3846753Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3846893Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.3847157Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3847284Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
2025-12-04T14:25:34.3847552Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3847690Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
2025-12-04T14:25:34.3848248Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2820669440.
2025-12-04T14:25:34.3848356Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935] 
2025-12-04T14:25:34.3848545Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3849002Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.3849108Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935] 
2025-12-04T14:25:34.3849308Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3849462Z E1204 14:20:37.040000 365488 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
[processes 0 (pid 365485), 2 (pid 365487) and 1 (pid 365486) caught the identical leak and also exited with code 10; their tracebacks are omitted here, differing only in device id (0, 2, 1) and driver memory figures (device 0: 1421869056 -> 2973761536; devices 1 and 2: 1268776960 -> 2820669440)]
2025-12-04T14:25:34.3863158Z FAILED [8.5191s] [ 11%]
2025-12-04T14:25:34.3863160Z 
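[Annotation: the per-rank exit codes above are collected by the parent test process, which spawns one worker per GPU and converts any nonzero child exit into the "Process N exited with error code 10" RuntimeError shown next. A loose sketch of that pattern follows; it only illustrates the exit-code convention and is not the actual MultiProcessTestCase implementation in torch/testing/_internal/common_distributed.py.]

import sys
import torch.multiprocessing as mp

MEM_LEAK_EXIT_CODE = 10  # the "error code 10" seen throughout this log

def worker(rank: int, world_size: int) -> None:
    # The real harness runs the test method here under the leak-check policy;
    # this placeholder only shows how a failing rank reports itself.
    leaked = False  # a real worker would compare memory snapshots (see earlier sketch)
    if leaked:
        sys.exit(MEM_LEAK_EXIT_CODE)

if __name__ == "__main__":
    # join=True makes the parent raise if any child exits nonzero, which the
    # framework then reports as "Process N exited with error code 10".
    mp.spawn(worker, args=(4,), nprocs=4, join=True)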
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3865392Z raise RuntimeError(msg) 2025-12-04T14:25:34.3865821Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T14:25:34.3865824Z 2025-12-04T14:25:34.3865898Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3866238Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3866240Z 2025-12-04T14:25:34.3866326Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3866331Z 2025-12-04T14:25:34.3866333Z 2025-12-04T14:25:34.3866408Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.3866496Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.3866762Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-22d0aabb4c53013f.xml - 2025-12-04T14:25:34.3866844Z =========================== short test summary info ============================ 2025-12-04T14:25:34.3867186Z FAILED [8.5191s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.3867233Z Traceback (most recent call last): 2025-12-04T14:25:34.3867394Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3867456Z getattr(self, test_name)() 2025-12-04T14:25:34.3867615Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3867650Z fn() 2025-12-04T14:25:34.3867799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3867839Z method(*args, **kwargs) 2025-12-04T14:25:34.3867986Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3868026Z method(*args, **kwargs) 2025-12-04T14:25:34.3868173Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3868209Z with policy(): 2025-12-04T14:25:34.3868357Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3868398Z raise RuntimeError(msg) 2025-12-04T14:25:34.3868827Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T14:25:34.3868832Z 2025-12-04T14:25:34.3868905Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3869236Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3869239Z 2025-12-04T14:25:34.3869322Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3869386Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T14:25:34.3869446Z ======================= 1 failed, 6 deselected in 8.68s ======================== 2025-12-04T14:25:34.3869483Z Got exit code 1 2025-12-04T14:25:34.3869523Z Retrying single test... 2025-12-04T14:25:34.3869745Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-30018f2f07ccfc17.xml 2025-12-04T14:25:34.3869803Z ============================= test session starts ============================== 2025-12-04T14:25:34.3869915Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.3869955Z cachedir: .pytest_cache 2025-12-04T14:25:34.3870110Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.3870154Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.3870228Z configfile: pytest.ini 2025-12-04T14:25:34.3870392Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.3870750Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.3870829Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.3871171Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.3871226Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.3871281Z collected 15 items / 14 deselected / 1 selected 2025-12-04T14:25:34.3871633Z stepcurrent: skipping 6 already run items. 
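[Annotation: the "Retrying single test..." step is the CI driver narrowing the run down to the one failing test before declaring it consistently broken. A hypothetical local approximation follows, leaning on the pytest-rerunfailures plugin listed in the session header; the actual CI driver appears to implement its own retry loop rather than use this flag.]

import pytest

exit_code = pytest.main([
    "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::"
    "TestFSDPWithDeviceMeshAndDTensorCUDA::"
    "test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda",
    "-v",
    "--reruns", "2",  # rerun the test up to twice on failure
])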
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3871677Z Running 1 items in this shard 2025-12-04T14:25:34.3871679Z 2025-12-04T14:25:34.3872081Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 14:20:40.798000 365818 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 365887 2025-12-04T14:25:34.3872233Z I1204 14:20:40.799000 365818 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 365888 2025-12-04T14:25:34.3872382Z I1204 14:20:40.799000 365818 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 365889 2025-12-04T14:25:34.3872529Z I1204 14:20:40.800000 365818 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 365890 2025-12-04T14:25:34.3873200Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3873244Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3873735Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.3873784Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.3874445Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3874488Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3875148Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3875198Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3875867Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. 
Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.3875906Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.3876421Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.3876471Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.3876950Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.3876996Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.3877475Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T14:25:34.3877521Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.3877654Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3877810Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3878091Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3878237Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3878514Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3878628Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3878899Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3879037Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3879304Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3879442Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3879707Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3879854Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3880122Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3880291Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3880872Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
2025-12-04T14:25:34.3880982Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3881168Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3881621Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3881727Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3881928Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3882084Z E1204 14:20:48.708000 365887 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.3882213Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3882365Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3882642Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3882786Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3883062Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3883175Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3883443Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3883580Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3883850Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3884000Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3884277Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3884404Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3884672Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3884830Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3885372Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T14:25:34.3885478Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3885664Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3886117Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3886223Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3886423Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3886577Z E1204 14:20:48.740000 365888 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.3886705Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3886856Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3887135Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3887281Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3887554Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3887669Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3887938Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3888078Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3888354Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3888505Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3888769Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3888895Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3889193Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3889333Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3889876Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 803209216 and is now 2820669440. 2025-12-04T14:25:34.3889980Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3890197Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3890652Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3890758Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3890958Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3891112Z E1204 14:20:48.747000 365890 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.3891242Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3891391Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3891672Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3891816Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3892090Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3892203Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3892470Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3892642Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3892905Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3893042Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3893337Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3893465Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.3893731Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3893873Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.3894420Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T14:25:34.3894524Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3894711Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3895162Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3895266Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3895465Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3895619Z E1204 14:20:48.771000 365889 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.3895658Z FAILED [9.3220s] [100%] 2025-12-04T14:25:34.3895661Z 2025-12-04T14:25:34.3895715Z =================================== FAILURES =================================== 2025-12-04T14:25:34.3895896Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T14:25:34.3895940Z Traceback (most recent call last): 2025-12-04T14:25:34.3896102Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.3896145Z self._join_processes(fn) 2025-12-04T14:25:34.3896319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.3896372Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.3896549Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.3896604Z raise RuntimeError(error) 2025-12-04T14:25:34.3896697Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.3896740Z Traceback (most recent call last): 2025-12-04T14:25:34.3896898Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3896939Z getattr(self, test_name)() 2025-12-04T14:25:34.3897096Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3897130Z fn() 2025-12-04T14:25:34.3897278Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3897337Z method(*args, **kwargs) 2025-12-04T14:25:34.3897487Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3897526Z method(*args, **kwargs) 2025-12-04T14:25:34.3897674Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3897710Z with policy(): 2025-12-04T14:25:34.3897860Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3897899Z raise RuntimeError(msg) 2025-12-04T14:25:34.3898330Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T14:25:34.3898333Z 2025-12-04T14:25:34.3898407Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3898742Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3898746Z 2025-12-04T14:25:34.3898832Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3898834Z 2025-12-04T14:25:34.3898835Z 2025-12-04T14:25:34.3898907Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.3898993Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.3899259Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-30018f2f07ccfc17.xml - 2025-12-04T14:25:34.3899319Z =========================== short test summary info ============================ 2025-12-04T14:25:34.3899662Z FAILED [9.3220s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.3899710Z Traceback (most recent call last): 2025-12-04T14:25:34.3899870Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3899911Z getattr(self, test_name)() 2025-12-04T14:25:34.3900069Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3900103Z fn() 2025-12-04T14:25:34.3900296Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3900336Z method(*args, **kwargs) 2025-12-04T14:25:34.3900483Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3900540Z method(*args, **kwargs) 2025-12-04T14:25:34.3900701Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3900736Z with policy(): 2025-12-04T14:25:34.3900887Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3900925Z raise RuntimeError(msg) 2025-12-04T14:25:34.3901378Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! 
Retrying single test...
Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-e641c61c9b4e5a27.xml
============================= test session starts ==============================
platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
cachedir: .pytest_cache
hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
rootdir: /var/lib/jenkins/pytorch
configfile: pytest.ini
plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
  class TestDummyModel(torch.nn.Module):
/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
  class TestDummyModelUneven(torch.nn.Module):
collected 15 items / 14 deselected / 1 selected
stepcurrent: skipping 6 already run items.
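[Editor's note: the two PytestCollectionWarning messages above are benign; the nn.Module helpers are picked up only because their names start with "Test". A sketch of the usual fix, using pytest's `__test__ = False` convention; the class body here is illustrative, not the real model from the test file.]

    # Hedged sketch: marking a "Test*"-named helper class as not-a-test so
    # pytest stops trying to collect it. Body is illustrative only.
    import torch

    class TestDummyModel(torch.nn.Module):
        __test__ = False  # pytest convention: do not collect this class

        def __init__(self) -> None:
            super().__init__()
            self.net = torch.nn.Linear(8, 8)  # placeholder layer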
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
Running 1 items in this shard

distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
I1204 14:20:52.877000 366220 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 366289
I1204 14:20:52.878000 366220 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 366290
I1204 14:20:52.878000 366220 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 366291
I1204 14:20:52.879000 366220 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 366292
/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use the APIs get_state_dict() and set_state_dict(), which can support different parallelisms (FSDP1, FSDP2, DDP). API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict . Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
  FSDP.set_state_dict_type(
[... FutureWarning repeated by each of the four ranks ...]
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated; it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by the group, please use `_device_capability(group)`.
  device = _get_pg_default_device(group)
[... UserWarning repeated by each of the four ranks ...]
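[Editor's note: the FutureWarning above already names the migration path, the parallelism-agnostic get_state_dict()/set_state_dict() APIs in torch.distributed.checkpoint.state_dict. A hedged sketch of that replacement, where `model` is assumed to be the FSDP-wrapped module and `optim` its optimizer; the options shown mirror the test's offload_to_cpu=True case and follow the linked recipe.]

    # Sketch of the replacement the deprecation warning points at.
    # `model` (FSDP-wrapped) and `optim` are assumed to already exist.
    from torch.distributed.checkpoint.state_dict import (
        StateDictOptions,
        get_state_dict,
        set_state_dict,
    )

    options = StateDictOptions(cpu_offload=True)  # mirrors offload_to_cpu=True
    model_sd, optim_sd = get_state_dict(model, optim, options=options)
    set_state_dict(model, optim,
                   model_state_dict=model_sd,
                   optim_state_dict=optim_sd,
                   options=options)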
E1204 14:21:00.150000 366289 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
E1204 14:21:00.150000 366289 site-packages/torch/testing/_internal/common_distributed.py:935] [... traceback through run_test / wrapper / policy().__exit__, identical to the one above ...]
E1204 14:21:00.150000 366289 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
E1204 14:21:00.150000 366289 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
E1204 14:21:00.150000 366289 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
E1204 14:21:00.150000 366289 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
E1204 14:21:00.150000 366289 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
[... the same "Caught exception" block, traceback, and repro instructions follow from process 3 (pid 366292, device 3: caching allocator 0 -> 2560, driver 1254096896 -> 2820669440), process 1 (pid 366290, device 1: 0 -> 2560, driver 1268776960 -> 2820669440), and process 2 (pid 366291, device 2: 0 -> 2560, driver 1268776960 -> 2820669440); each exits with code 10 ...]
FAILED [8.5205s] [100%]
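[Editor's note: the exit-code-10 pattern above comes from the multiprocess harness visible in the tracebacks: each rank runs the test body in its own child process and exits non-zero on failure, and the parent joins all ranks and turns any non-zero exit code into a RuntimeError. The real logic is MultiProcessTestCase in torch/testing/_internal/common_distributed.py; below is only a minimal sketch of that pattern, not the actual class.]

    # Minimal parent/child sketch of the join-and-check pattern in the log.
    import multiprocessing as mp
    import sys

    TEST_FAILURE_EXIT_CODE = 10  # matches the "exit code: 10" seen above

    def _worker(rank: int) -> None:
        try:
            pass  # the per-rank test body would run here
        except Exception as exc:
            print(f"rank {rank} caught exception: {exc}", file=sys.stderr)
            sys.exit(TEST_FAILURE_EXIT_CODE)

    def run_in_processes(world_size: int = 4) -> None:
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=_worker, args=(r,)) for r in range(world_size)]
        for p in procs:
            p.start()
        for rank, p in enumerate(procs):
            p.join()
            if p.exitcode != 0:
                raise RuntimeError(
                    f"Process {rank} exited with error code {p.exitcode}")

    if __name__ == "__main__":
        run_in_processes()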
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3930413Z raise RuntimeError(msg) 2025-12-04T14:25:34.3930842Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T14:25:34.3930844Z 2025-12-04T14:25:34.3930920Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3931282Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3931284Z 2025-12-04T14:25:34.3931370Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3931373Z 2025-12-04T14:25:34.3931431Z Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.3931475Z Traceback (most recent call last): 2025-12-04T14:25:34.3931635Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3931676Z getattr(self, test_name)() 2025-12-04T14:25:34.3931832Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3931866Z fn() 2025-12-04T14:25:34.3932014Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3932054Z method(*args, **kwargs) 2025-12-04T14:25:34.3932201Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3932242Z method(*args, **kwargs) 2025-12-04T14:25:34.3932388Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3932426Z with policy(): 2025-12-04T14:25:34.3932574Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3932614Z raise RuntimeError(msg) 2025-12-04T14:25:34.3933040Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1254096896 and is now 2820669440. 
2025-12-04T14:25:34.3933044Z 2025-12-04T14:25:34.3933117Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3933450Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3933454Z 2025-12-04T14:25:34.3933538Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3933540Z 2025-12-04T14:25:34.3933541Z 2025-12-04T14:25:34.3933616Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.3933701Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.3933971Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-e641c61c9b4e5a27.xml - 2025-12-04T14:25:34.3934031Z =========================== short test summary info ============================ 2025-12-04T14:25:34.3934373Z FAILED [8.5205s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.3934441Z Traceback (most recent call last): 2025-12-04T14:25:34.3934601Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3934644Z getattr(self, test_name)() 2025-12-04T14:25:34.3934799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3934833Z fn() 2025-12-04T14:25:34.3935001Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3935042Z method(*args, **kwargs) 2025-12-04T14:25:34.3935189Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3935229Z method(*args, **kwargs) 2025-12-04T14:25:34.3935376Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3935413Z with policy(): 2025-12-04T14:25:34.3935560Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3935600Z raise RuntimeError(msg) 2025-12-04T14:25:34.3936026Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
2025-12-04T14:25:34.3936028Z 2025-12-04T14:25:34.3936100Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3936435Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3936439Z 2025-12-04T14:25:34.3936522Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3936524Z 2025-12-04T14:25:34.3936581Z Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.3936624Z Traceback (most recent call last): 2025-12-04T14:25:34.3936784Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3936826Z getattr(self, test_name)() 2025-12-04T14:25:34.3936984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3937018Z fn() 2025-12-04T14:25:34.3937168Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3937207Z method(*args, **kwargs) 2025-12-04T14:25:34.3937355Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3937392Z method(*args, **kwargs) 2025-12-04T14:25:34.3937539Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3937574Z with policy(): 2025-12-04T14:25:34.3937722Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.3937761Z raise RuntimeError(msg) 2025-12-04T14:25:34.3938188Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1254096896 and is now 2820669440. 2025-12-04T14:25:34.3938213Z 2025-12-04T14:25:34.3938285Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3938615Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.3938618Z 2025-12-04T14:25:34.3938701Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3938763Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
======================= 1 failed, 14 deselected in 8.68s =======================
Got exit code 1
FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda
Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
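[Editor's note: the "Got exit code 1" / "Retrying single test..." / "FAILED CONSISTENTLY" progression is the test runner's retry policy: a failing test is re-run in isolation, and only a second failure is reported as consistent; with continue-through-error set the runner keeps going instead of aborting. The orchestration actually lives in PyTorch's test/run_test.py; in the rough sketch below, `run_pytest` is a hypothetical stand-in for invoking pytest on one test id.]

    # Hedged sketch of the retry verdict seen above; not the real runner code.
    def classify_failure(test_id: str, continue_through_error: bool,
                         run_pytest) -> bool:
        if run_pytest(test_id) == 0:
            return True                      # passed on the first try
        print("Got exit code 1")
        print("Retrying single test...")
        if run_pytest(test_id) == 0:
            return True                      # flaky: passed on retry
        print(f"FAILED CONSISTENTLY: {test_id}")
        if not continue_through_error:
            raise SystemExit(1)
        print("Test failed consistently, continuing with the rest of the "
              "tests due to continue-through-error being set")
        return False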
Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a608fb4cbd6a22cb.xml
============================= test session starts ==============================
platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
cachedir: .pytest_cache
hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
rootdir: /var/lib/jenkins/pytorch
configfile: pytest.ini
plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
collecting ... [... PytestCollectionWarning for 'TestDummyModel' and 'TestDummyModelUneven', identical to the previous session ...]
collected 15 items / 7 deselected / 8 selected
stepcurrent: skipping 7 already run items.
Running 8 items in this shard

distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
I1204 14:21:04.063000 366622 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 366691
I1204 14:21:04.063000 366622 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 366692
I1204 14:21:04.064000 366622 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 366693
I1204 14:21:04.064000 366622 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 366694
[... FutureWarning about FSDP.set_state_dict_type() and UserWarning about `_get_pg_default_device`, identical to the previous run, repeated by each of the four ranks ...]
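[Editor's note: the repeated UserWarning also names its own fix. Both helpers are private torch.distributed internals, so the import path below is an assumption that may change between releases; sketch only, and it requires an initialized process group.]

    # Sketch of the substitution the deprecation warning recommends.
    import torch.distributed as dist
    from torch.distributed.distributed_c10d import _get_object_coll_device  # internal API

    if dist.is_initialized():
        group = dist.group.WORLD  # the default process group
        device = _get_object_coll_device(group)  # replaces _get_pg_default_device(group)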
E1204 14:21:11.800000 366691 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
E1204 14:21:11.800000 366691 site-packages/torch/testing/_internal/common_distributed.py:935] [... traceback through run_test / wrapper / policy().__exit__, identical in shape to the failures above ...]
E1204 14:21:11.800000 366691 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
E1204 14:21:11.800000 366691 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
E1204 14:21:11.800000 366691 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
E1204 14:21:11.800000 366691 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
E1204 14:21:11.800000 366691 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T14:25:34.3950471Z E1204 14:21:11.800000 366691 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3950659Z E1204 14:21:11.800000 366691 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.3951116Z E1204 14:21:11.800000 366691 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.3951221Z E1204 14:21:11.800000 366691 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.3951438Z E1204 14:21:11.800000 366691 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.3951607Z E1204 14:21:11.800000 366691 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.3951737Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.3951888Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.3952192Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.3952336Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.3952610Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.3952722Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.3952985Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3953123Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3953386Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.3953524Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.3953789Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.3953916Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy():
2025-12-04T14:25:34.3954184Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3954323Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3954867Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3954974Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3955160Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3955615Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3955742Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3955942Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3956097Z E1204 14:21:11.812000 366692 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.3956228Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3956406Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3956681Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3956825Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3957098Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3957210Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3957476Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3957614Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3957879Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3958015Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3958278Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3958406Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3958672Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3958812Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3959352Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3959458Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3959644Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3960104Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3960256Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3960455Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3960635Z E1204 14:21:11.842000 366693 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.3960763Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3960914Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3961190Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3961332Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3961605Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3961718Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3961983Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3962122Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3962386Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3962522Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3962786Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3962911Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3963181Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3963317Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3963857Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1254096896 and is now 2820669440.
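For context on what raises these errors: the harness wraps each test in a CUDA/HIP memory-leak checker that snapshots per-device allocation counters before the test body runs and compares them afterwards, raising the RuntimeError quoted above when the numbers grow. A minimal sketch of that before/after idea, assuming torch is importable on a CUDA/ROCm build (this is not PyTorch's actual CudaMemoryLeakCheck implementation):

```python
# Minimal sketch of a per-device memory-leak check in the spirit of the
# checker above. NOT PyTorch's actual implementation; illustration only.
import torch

class MemoryLeakCheck:
    def __enter__(self):
        torch.cuda.synchronize()
        # Snapshot bytes held by the caching allocator on every visible device.
        self.before = [
            torch.cuda.memory_allocated(d) for d in range(torch.cuda.device_count())
        ]
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            return False  # do not mask the test's own failure
        torch.cuda.synchronize()
        for d in range(torch.cuda.device_count()):
            after = torch.cuda.memory_allocated(d)
            if after > self.before[d]:
                raise RuntimeError(
                    f"possible leak on device {d}: allocated memory was "
                    f"{self.before[d]} and is now {after}"
                )
        return False

# Usage: with MemoryLeakCheck(): run_the_test()
```

The real checker additionally consults driver-level allocation (hence the two figures per message) and retries before reporting, but the snapshot/compare structure is the same.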
2025-12-04T14:25:34.3963962Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3964164Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3964625Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3964729Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3964951Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3965106Z E1204 14:21:11.902000 366694 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.3965145Z FAILED [9.0187s] [ 12%]
2025-12-04T14:25:34.3965147Z
2025-12-04T14:25:34.3965202Z =================================== FAILURES ===================================
2025-12-04T14:25:34.3965379Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _
2025-12-04T14:25:34.3965425Z Traceback (most recent call last):
2025-12-04T14:25:34.3965584Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.3965628Z self._join_processes(fn)
2025-12-04T14:25:34.3965799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.3965852Z self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.3966026Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.3966071Z raise RuntimeError(error)
2025-12-04T14:25:34.3966150Z RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T14:25:34.3966193Z Traceback (most recent call last):
2025-12-04T14:25:34.3966352Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3966394Z getattr(self, test_name)()
2025-12-04T14:25:34.3966552Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3966585Z fn()
2025-12-04T14:25:34.3966738Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3966777Z method(*args, **kwargs)
2025-12-04T14:25:34.3966926Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3966966Z method(*args, **kwargs)
2025-12-04T14:25:34.3967114Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3967149Z with policy():
2025-12-04T14:25:34.3967298Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3967337Z raise RuntimeError(msg)
2025-12-04T14:25:34.3967768Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3967771Z
2025-12-04T14:25:34.3967844Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3968186Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3968199Z
2025-12-04T14:25:34.3968284Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3968287Z
2025-12-04T14:25:34.3968289Z
2025-12-04T14:25:34.3968362Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.3968449Z Process 1 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.3968739Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a608fb4cbd6a22cb.xml -
2025-12-04T14:25:34.3968802Z =========================== short test summary info ============================
2025-12-04T14:25:34.3969145Z FAILED [9.0187s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T14:25:34.3969191Z Traceback (most recent call last):
2025-12-04T14:25:34.3969353Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3969395Z getattr(self, test_name)()
2025-12-04T14:25:34.3969551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3969585Z fn()
2025-12-04T14:25:34.3969735Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3969774Z method(*args, **kwargs)
2025-12-04T14:25:34.3969922Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3969962Z method(*args, **kwargs)
2025-12-04T14:25:34.3970108Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3970145Z with policy():
2025-12-04T14:25:34.3970316Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3970357Z raise RuntimeError(msg)
2025-12-04T14:25:34.3970781Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3970785Z
2025-12-04T14:25:34.3970858Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3971194Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3971196Z
2025-12-04T14:25:34.3971279Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3971342Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.3971404Z ======================= 1 failed, 7 deselected in 9.16s ========================
2025-12-04T14:25:34.3971440Z Got exit code 1
2025-12-04T14:25:34.3971478Z Retrying single test...
2025-12-04T14:25:34.3971702Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-410bbbb13ebb89f9.xml
2025-12-04T14:25:34.3971777Z ============================= test session starts ==============================
2025-12-04T14:25:34.3971909Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.3971949Z cachedir: .pytest_cache
2025-12-04T14:25:34.3972105Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.3972149Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.3972188Z configfile: pytest.ini
2025-12-04T14:25:34.3972349Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.3972732Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3972785Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.3973125Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.3973180Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.3973234Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.3973559Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3973603Z Running 1 items in this shard
2025-12-04T14:25:34.3973605Z
2025-12-04T14:25:34.3974003Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 14:21:15.933000 367024 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 367093
2025-12-04T14:25:34.3974157Z I1204 14:21:15.934000 367024 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 367094
2025-12-04T14:25:34.3974306Z I1204 14:21:15.935000 367024 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 367095
2025-12-04T14:25:34.3974452Z I1204 14:21:15.935000 367024 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 367096
2025-12-04T14:25:34.3975124Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3975168Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3975830Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3975872Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3976532Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3976596Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3977282Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.3977322Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.3977814Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.3977863Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.3978355Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.3978401Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.3978884Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.3978930Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.3979413Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
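The FutureWarning repeated above names its own replacement. Assuming a torch build that ships torch.distributed.checkpoint.state_dict (the module the warning's API doc links to), the suggested migration looks roughly like this sketch; `model` and `optim` are hypothetical stand-ins for the objects under test, not names from this log:

```python
# Hedged sketch of the migration the FutureWarning suggests: replace
# FSDP.state_dict_type()/FSDP.set_state_dict_type() with the
# parallelism-agnostic get_state_dict()/set_state_dict() helpers.
from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

def checkpoint_roundtrip(model, optim):
    # Collect state dicts that work across FSDP1, FSDP2, and DDP.
    model_sd, optim_sd = get_state_dict(model, optim)
    # ... persist model_sd / optim_sd with your checkpointing mechanism ...
    # Load them back in a single call (keyword-only arguments).
    set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)
```

Migrating the test at test_fsdp_dtensor_state_dict.py:189 along these lines would also silence the four copies of the warning per run.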
2025-12-04T14:25:34.3979460Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.3979591Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3979744Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3980026Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3980197Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3980477Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3980592Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3980861Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3981027Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3981295Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3981431Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3981722Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3981851Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3982119Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3982258Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3982812Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
2025-12-04T14:25:34.3982920Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3983108Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3983559Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3983665Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3983865Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3984019Z E1204 14:21:23.249000 367093 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.3984148Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3984297Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3984572Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3984715Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3984990Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3985116Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3985398Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3985535Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3985799Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3985959Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3986225Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3986352Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3986619Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3986756Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3987300Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3987408Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3987595Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3988049Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3988153Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3988354Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3988508Z E1204 14:21:23.262000 367096 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.3988636Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3988785Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3989060Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3989205Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3989494Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3989619Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3989884Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3990022Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3990352Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3990492Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3990758Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3990882Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3991150Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3991287Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3991828Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3991935Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3992122Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3992579Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3992685Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3992885Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3993038Z E1204 14:21:23.329000 367095 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.3993166Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.3993316Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.3993591Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3993764Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.3994037Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3994149Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.3994435Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3994574Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3994840Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3994977Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.3995240Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3995366Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.3995632Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3995770Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.3996312Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.3996416Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3996603Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.3997054Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.3997161Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.3997358Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.3997511Z E1204 14:21:23.347000 367094 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.3997548Z FAILED [8.5200s] [100%]
2025-12-04T14:25:34.3997552Z
2025-12-04T14:25:34.3997607Z =================================== FAILURES ===================================
2025-12-04T14:25:34.3997786Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _
2025-12-04T14:25:34.3997853Z Traceback (most recent call last):
2025-12-04T14:25:34.3998015Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.3998057Z self._join_processes(fn)
2025-12-04T14:25:34.3998229Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.3998280Z self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.3998455Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.3998517Z raise RuntimeError(error)
2025-12-04T14:25:34.3998595Z RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T14:25:34.3998638Z Traceback (most recent call last):
2025-12-04T14:25:34.3998799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.3998840Z getattr(self, test_name)()
2025-12-04T14:25:34.3998998Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.3999031Z fn()
2025-12-04T14:25:34.3999182Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3999221Z method(*args, **kwargs)
2025-12-04T14:25:34.3999368Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.3999407Z method(*args, **kwargs)
2025-12-04T14:25:34.3999557Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.3999593Z with policy():
2025-12-04T14:25:34.3999742Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.3999782Z raise RuntimeError(msg)
2025-12-04T14:25:34.4000255Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
2025-12-04T14:25:34.4000257Z
2025-12-04T14:25:34.4000331Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4000663Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.4000667Z
2025-12-04T14:25:34.4000753Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4000756Z
2025-12-04T14:25:34.4000757Z
2025-12-04T14:25:34.4000830Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.4000916Z Process 0 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.4001184Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-410bbbb13ebb89f9.xml -
2025-12-04T14:25:34.4001244Z =========================== short test summary info ============================
2025-12-04T14:25:34.4001587Z FAILED [8.5200s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T14:25:34.4001653Z Traceback (most recent call last):
2025-12-04T14:25:34.4001814Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4001872Z getattr(self, test_name)()
2025-12-04T14:25:34.4002028Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4002062Z fn()
2025-12-04T14:25:34.4002211Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4002252Z method(*args, **kwargs)
2025-12-04T14:25:34.4002400Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4002467Z method(*args, **kwargs)
2025-12-04T14:25:34.4002615Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4002653Z with policy():
2025-12-04T14:25:34.4002805Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4002844Z raise RuntimeError(msg)
2025-12-04T14:25:34.4003273Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
2025-12-04T14:25:34.4003275Z
2025-12-04T14:25:34.4003348Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4003683Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.4003686Z
2025-12-04T14:25:34.4003771Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4003835Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.4003895Z ======================= 1 failed, 14 deselected in 8.69s =======================
2025-12-04T14:25:34.4003931Z Got exit code 1
2025-12-04T14:25:34.4003969Z Retrying single test...
2025-12-04T14:25:34.4004193Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a42ec3e48444fca3.xml
2025-12-04T14:25:34.4004249Z ============================= test session starts ==============================
2025-12-04T14:25:34.4004363Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.4004402Z cachedir: .pytest_cache
2025-12-04T14:25:34.4004558Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.4004604Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.4004642Z configfile: pytest.ini
2025-12-04T14:25:34.4004802Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.4005156Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4005205Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.4005548Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4005617Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.4005683Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.4006006Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.4006047Z Running 1 items in this shard
2025-12-04T14:25:34.4006049Z
2025-12-04T14:25:34.4006471Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 14:21:27.016000 367426 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 367495
2025-12-04T14:25:34.4006625Z I1204 14:21:27.016000 367426 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 367496
2025-12-04T14:25:34.4006775Z I1204 14:21:27.017000 367426 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 367497
2025-12-04T14:25:34.4006923Z I1204 14:21:27.017000 367426 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 367498
2025-12-04T14:25:34.4007595Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4007637Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.4008303Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4008348Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.4009012Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4009052Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.4009546Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4009593Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.4010301Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4010358Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.4010860Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4010906Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.4011413Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4011460Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.4011941Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
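Aside from the deprecation noise, the PytestCollectionWarning emitted during collection in each retry session ("cannot collect test class 'TestDummyModel' because it has a __init__ constructor") is benign here: those classes are nn.Modules whose names merely start with "Test". If the warning were unwanted, pytest's documented opt-out is to mark the class as not-a-test; a hedged sketch (the Linear layer is illustrative, not the real model definition):

```python
import torch

class TestDummyModel(torch.nn.Module):
    # Tell pytest this is a model, not a test class, so collection skips it
    # silently. `__test__ = False` is pytest's documented opt-out mechanism.
    __test__ = False

    def __init__(self) -> None:
        super().__init__()
        self.net = torch.nn.Linear(8, 8)  # placeholder layer for illustration
```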
2025-12-04T14:25:34.4011988Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.4012121Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4012273Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.4012553Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4012697Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.4012974Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4013089Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.4013359Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4013497Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4013764Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4013902Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4014167Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4014293Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.4014561Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4014719Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.4015274Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
2025-12-04T14:25:34.4015381Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4015589Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4016041Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.4016149Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4016347Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4016504Z E1204 14:21:34.350000 367495 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.4016632Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4016780Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.4017057Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4017201Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.4017477Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4017591Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.4017858Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4017997Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4018263Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4018400Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4018666Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4018791Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.4019069Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4019220Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.4019790Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1105199104 and is now 2820669440.
2025-12-04T14:25:34.4019897Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4020084Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4020571Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.4020676Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4020876Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4021029Z E1204 14:21:34.434000 367498 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.4021156Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4021306Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.4021580Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4021723Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.4021998Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4022111Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.4022378Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4022517Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4022784Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4022920Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4023185Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4023344Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.4023611Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4023748Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.4024313Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440.
2025-12-04T14:25:34.4024420Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4024606Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4025060Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.4025165Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4025364Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4025519Z E1204 14:21:34.466000 367496 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.4025647Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4025795Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.4026070Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4026216Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.4026488Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4026602Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.4026868Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4027006Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4027274Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4027410Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4027685Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4027823Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.4028090Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4028249Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.4028787Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1256194048 and is now 2820669440.
2025-12-04T14:25:34.4028893Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4029080Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4029536Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4029643Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4029844Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4029997Z E1204 14:21:34.476000 367497 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4030035Z FAILED [8.6179s] [100%] 2025-12-04T14:25:34.4030037Z 2025-12-04T14:25:34.4030091Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4030305Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T14:25:34.4030351Z Traceback (most recent call last): 2025-12-04T14:25:34.4030512Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4030555Z self._join_processes(fn) 2025-12-04T14:25:34.4030726Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4030778Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4030952Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4030994Z raise RuntimeError(error) 2025-12-04T14:25:34.4031073Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.4031116Z Traceback (most recent call last): 2025-12-04T14:25:34.4031277Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4031318Z getattr(self, test_name)() 2025-12-04T14:25:34.4031475Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4031521Z fn() 2025-12-04T14:25:34.4031687Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4031726Z method(*args, **kwargs) 2025-12-04T14:25:34.4031875Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4031913Z method(*args, **kwargs) 2025-12-04T14:25:34.4032061Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4032096Z with policy(): 2025-12-04T14:25:34.4032272Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4032311Z raise RuntimeError(msg) 2025-12-04T14:25:34.4032736Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T14:25:34.4032740Z 2025-12-04T14:25:34.4032815Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4033145Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4033147Z 2025-12-04T14:25:34.4033233Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4033237Z 2025-12-04T14:25:34.4033238Z 2025-12-04T14:25:34.4033312Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4033399Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.4033665Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a42ec3e48444fca3.xml - 2025-12-04T14:25:34.4033726Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4034069Z FAILED [8.6179s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.4034114Z Traceback (most recent call last): 2025-12-04T14:25:34.4034279Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4034320Z getattr(self, test_name)() 2025-12-04T14:25:34.4034477Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4034512Z fn() 2025-12-04T14:25:34.4034660Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4034699Z method(*args, **kwargs) 2025-12-04T14:25:34.4034847Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4034885Z method(*args, **kwargs) 2025-12-04T14:25:34.4035032Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4035066Z with policy(): 2025-12-04T14:25:34.4035216Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4035255Z raise RuntimeError(msg) 2025-12-04T14:25:34.4035691Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! 
2025-12-04T14:25:34.4037015Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-233ab525eb1486de.xml
2025-12-04T14:25:34.4037071Z ============================= test session starts ==============================
2025-12-04T14:25:34.4037185Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.4037226Z cachedir: .pytest_cache
2025-12-04T14:25:34.4037382Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.4037428Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.4037468Z configfile: pytest.ini
2025-12-04T14:25:34.4037627Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.4037981Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4038028Z     class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.4038371Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4038427Z     class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.4038480Z collected 15 items / 8 deselected / 7 selected
2025-12-04T14:25:34.4038531Z stepcurrent: skipping 8 already run items.
2025-12-04T14:25:34.4038573Z Running 7 items in this shard
2025-12-04T14:25:34.4038575Z
2025-12-04T14:25:34.4038991Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 14:21:38.323000 367828 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 367897
2025-12-04T14:25:34.4039145Z I1204 14:21:38.324000 367828 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 367898
2025-12-04T14:25:34.4039295Z I1204 14:21:38.325000 367828 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 367899
2025-12-04T14:25:34.4039454Z I1204 14:21:38.325000 367828 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 367900
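Each "Started process N with pid ..." line above is the harness forking one worker per rank; the run_test/_join_processes/_check_return_codes frames in the tracebacks are the parent joining those workers and turning a nonzero worker exit (code 10 here) into the RuntimeError it re-raises. A simplified stand-in for that spawn-and-join pattern, assuming a 4-rank gloo group with placeholder rendezvous settings; this is a sketch, not the real MultiProcessTestCase code:

    import os
    import torch.distributed as dist
    import torch.multiprocessing as mp

    def _worker(rank, world_size):
        # Rendezvous settings are placeholders for this sketch.
        os.environ["MASTER_ADDR"] = "127.0.0.1"
        os.environ["MASTER_PORT"] = "29500"
        dist.init_process_group("gloo", rank=rank, world_size=world_size)
        try:
            pass  # the per-rank test body would run here and raise on failure
        finally:
            dist.destroy_process_group()

    if __name__ == "__main__":
        # join=True makes the parent re-raise a worker failure, roughly what
        # _join_processes/_check_return_codes do in the tracebacks above.
        mp.spawn(_worker, args=(4,), nprocs=4, join=True)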
2025-12-04T14:25:34.4040140Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4040221Z     FSDP.set_state_dict_type(
2025-12-04T14:25:34.4042864Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatibility reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4042911Z     device = _get_pg_default_device(group)
2025-12-04T14:25:34.4045216Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4045257Z     FSDP.set_state_dict_type(
2025-12-04T14:25:34.4047166Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatibility reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4047226Z     distributed_c10d._get_pg_default_device(pg).type
2025-12-04T14:25:34.4052592Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T14:25:34.4052683Z     local_shape = tensor.shape
2025-12-04T14:25:34.4052914Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T14:25:34.4052954Z     tensor.shape,
2025-12-04T14:25:34.4053183Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T14:25:34.4053218Z     tensor.dtype,
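The FSDP.set_state_dict_type() FutureWarning above points at its own replacement API. A sketch of the migration it suggests, following the warning's doc link; this assumes a torch build where torch.distributed.checkpoint.state_dict exposes these helpers, and the Linear module here is just a stand-in for the FSDP-wrapped model:

    import torch
    from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

    model = torch.nn.Linear(4, 4)  # placeholder for the FSDP-wrapped module
    optim = torch.optim.SGD(model.parameters(), lr=0.1)

    # Instead of FSDP.set_state_dict_type(...) around state_dict()/load_state_dict():
    model_sd, optim_sd = get_state_dict(model, optim)
    # ... checkpoint model_sd / optim_sd, then restore with:
    set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)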
2025-12-04T14:25:34.4055744Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4055902Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.4056184Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4056357Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
2025-12-04T14:25:34.4056635Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4056752Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
2025-12-04T14:25:34.4057037Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4057180Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.4057447Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4057584Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.4057848Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4057975Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
2025-12-04T14:25:34.4058243Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4058381Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
2025-12-04T14:25:34.4058946Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664.
2025-12-04T14:25:34.4059057Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4059245Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4059715Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.4059821Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4060025Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4060218Z E1204 14:21:45.718000 367897 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.4060349Z E1204 14:21:45.779000 367900 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4063530Z E1204 14:21:45.779000 367900 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1258291200 and is now 2850029568.
2025-12-04T14:25:34.4064759Z E1204 14:21:45.779000 367900 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.4064895Z E1204 14:21:45.791000 367899 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4068024Z E1204 14:21:45.791000 367899 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568.
2025-12-04T14:25:34.4069268Z E1204 14:21:45.791000 367899 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.4069394Z E1204 14:21:45.816000 367898 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4072569Z E1204 14:21:45.816000 367898 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1262485504 and is now 2850029568.
2025-12-04T14:25:34.4073810Z E1204 14:21:45.816000 367898 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.4073849Z FAILED [8.6217s] [ 14%]
2025-12-04T14:25:34.4073852Z
2025-12-04T14:25:34.4073908Z =================================== FAILURES ===================================
2025-12-04T14:25:34.4074127Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda _
2025-12-04T14:25:34.4074175Z Traceback (most recent call last):
2025-12-04T14:25:34.4074335Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.4074379Z     self._join_processes(fn)
2025-12-04T14:25:34.4074548Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.4074602Z     self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.4074776Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.4074819Z     raise RuntimeError(error)
2025-12-04T14:25:34.4074897Z RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T14:25:34.4074944Z Traceback (most recent call last):
2025-12-04T14:25:34.4075102Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4075145Z     getattr(self, test_name)()
2025-12-04T14:25:34.4075303Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4075338Z     fn()
2025-12-04T14:25:34.4075487Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4075527Z     method(*args, **kwargs)
2025-12-04T14:25:34.4075674Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4075714Z     method(*args, **kwargs)
2025-12-04T14:25:34.4075861Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4075900Z     with policy():
2025-12-04T14:25:34.4076051Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4076091Z     raise RuntimeError(msg)
2025-12-04T14:25:34.4076534Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664.
2025-12-04T14:25:34.4076536Z
2025-12-04T14:25:34.4076610Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4076960Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.4076962Z
2025-12-04T14:25:34.4077049Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4077140Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.4077235Z Process 0 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.4077505Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-233ab525eb1486de.xml -
2025-12-04T14:25:34.4077566Z =========================== short test summary info ============================
2025-12-04T14:25:34.4077951Z FAILED [8.6217s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T14:25:34.4079577Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664.
2025-12-04T14:25:34.4080152Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.4080251Z ======================= 1 failed, 8 deselected in 8.78s ========================
2025-12-04T14:25:34.4080287Z Got exit code 1
2025-12-04T14:25:34.4080325Z Retrying single test...
2025-12-04T14:25:34.4080549Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-184a82241e45e895.xml
2025-12-04T14:25:34.4080605Z ============================= test session starts ==============================
2025-12-04T14:25:34.4080718Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.4080759Z cachedir: .pytest_cache
2025-12-04T14:25:34.4080915Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.4080979Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.4081033Z configfile: pytest.ini
2025-12-04T14:25:34.4081194Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.4081550Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4081600Z     class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.4081978Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4082034Z     class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.4082091Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.4082429Z stepcurrent: skipping 8 already run items.
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.4082472Z Running 1 items in this shard
2025-12-04T14:25:34.4082474Z
2025-12-04T14:25:34.4082890Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 14:21:49.690000 368230 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 368299
2025-12-04T14:25:34.4083047Z I1204 14:21:49.691000 368230 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 368300
2025-12-04T14:25:34.4083198Z I1204 14:21:49.691000 368230 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 368301
2025-12-04T14:25:34.4083348Z I1204 14:21:49.692000 368230 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 368302
2025-12-04T14:25:34.4084028Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4084071Z     FSDP.set_state_dict_type(
2025-12-04T14:25:34.4086728Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatibility reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4086776Z     device = _get_pg_default_device(group)
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4090538Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4091199Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4091240Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4091748Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4091809Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4092289Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4092346Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4092829Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4092885Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4093362Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4093414Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4093649Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4093692Z local_shape = tensor.shape 2025-12-04T14:25:34.4093926Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T14:25:34.4093962Z tensor.shape, 2025-12-04T14:25:34.4094192Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4094227Z tensor.dtype, 2025-12-04T14:25:34.4094455Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4094498Z local_shape = tensor.shape 2025-12-04T14:25:34.4094727Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4094762Z tensor.shape, 2025-12-04T14:25:34.4095003Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4095048Z tensor.dtype, 2025-12-04T14:25:34.4095276Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4095317Z local_shape = tensor.shape 2025-12-04T14:25:34.4095544Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4095579Z tensor.shape, 2025-12-04T14:25:34.4095829Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4095867Z tensor.dtype, 2025-12-04T14:25:34.4096095Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4096136Z local_shape = tensor.shape 2025-12-04T14:25:34.4096363Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4096398Z tensor.shape, 2025-12-04T14:25:34.4096625Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
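The FutureWarning repeated above names its replacement API directly. A minimal migration sketch, assuming a recent PyTorch where torch.distributed.checkpoint.state_dict is available; `model` and `optim` are illustrative placeholders for an FSDP-wrapped module and its optimizer running inside an initialized process group, not objects taken from this test:

    import torch
    from torch.distributed.checkpoint.state_dict import (
        StateDictOptions,
        get_state_dict,
        set_state_dict,
    )

    def checkpoint_roundtrip(model: torch.nn.Module, optim: torch.optim.Optimizer):
        # Replaces the deprecated FSDP.set_state_dict_type(...) context dance:
        # one call returns model and optimizer state dicts for any parallelism.
        model_sd, optim_sd = get_state_dict(
            model, optim, options=StateDictOptions(cpu_offload=True)
        )
        # Restore symmetrically (normally after persisting/loading the dicts,
        # e.g. via torch.distributed.checkpoint).
        set_state_dict(
            model,
            optim,
            model_state_dict=model_sd,
            optim_state_dict=optim_sd,
        )

This must be called on every rank of the process group, which matches how the test drives its four workers.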
2025-12-04T14:25:34.4096796Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4096950Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.4097233Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4097381Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
2025-12-04T14:25:34.4097658Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4097773Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
2025-12-04T14:25:34.4098042Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4098183Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.4098450Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4098587Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.4098855Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4098982Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
2025-12-04T14:25:34.4099260Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4099411Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
2025-12-04T14:25:34.4100001Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664.
2025-12-04T14:25:34.4100111Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4100328Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4100794Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.4100900Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4101101Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4101256Z E1204 14:21:57.089000 368299 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
[processes 3, 2, and 1 hit the same failure with identical tracebacks and repro instructions; only the device and driver byte counts differ:]
2025-12-04T14:25:34.4104564Z E1204 14:21:57.144000 368302 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 803209216 and is now 2850029568.
2025-12-04T14:25:34.4105777Z E1204 14:21:57.144000 368302 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.4109053Z E1204 14:21:57.165000 368301 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568.
2025-12-04T14:25:34.4110301Z E1204 14:21:57.165000 368301 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.4113593Z E1204 14:21:57.176000 368300 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568.
2025-12-04T14:25:34.4114807Z E1204 14:21:57.176000 368300 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.4114849Z FAILED [8.6197s] [100%]
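Each worker failed the post-test memory check, not the test logic itself. Conceptually, the harness enabled by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 snapshots per-device memory before the test body and verifies it returns to baseline afterwards; the sketch below is a simplified stand-in for the real checker in torch/testing/_internal/common_utils.py, which additionally tracks driver-level usage (the "CUDA driver allocated memory" numbers above) via driver queries such as torch.cuda.mem_get_info:

    import torch

    def assert_no_cuda_leak(test_fn, device: int = 0) -> None:
        # Simplified stand-in for PyTorch's CUDA memory-leak checker.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        before = torch.cuda.memory_allocated(device)
        test_fn()
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()  # drop cached blocks; only live tensors remain
        after = torch.cuda.memory_allocated(device)
        if after > before:
            raise RuntimeError(
                f"possible CUDA leak on device {device}: "
                f"{before} -> {after} bytes still allocated"
            )

The identical 4608-byte delta on every device suggests a small allocation made during the test is still referenced when the check runs.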
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4117142Z raise RuntimeError(msg) 2025-12-04T14:25:34.4117580Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 2025-12-04T14:25:34.4117583Z 2025-12-04T14:25:34.4117662Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4118009Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4118012Z 2025-12-04T14:25:34.4118102Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4118105Z 2025-12-04T14:25:34.4118107Z 2025-12-04T14:25:34.4118183Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4118270Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.4118540Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-184a82241e45e895.xml - 2025-12-04T14:25:34.4118601Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4118960Z FAILED [8.6197s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.4119007Z Traceback (most recent call last): 2025-12-04T14:25:34.4119173Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4119217Z getattr(self, test_name)() 2025-12-04T14:25:34.4119376Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4119409Z fn() 2025-12-04T14:25:34.4119557Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4119595Z method(*args, **kwargs) 2025-12-04T14:25:34.4119743Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4119783Z method(*args, **kwargs) 2025-12-04T14:25:34.4119931Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4119979Z with policy(): 2025-12-04T14:25:34.4120130Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4120217Z raise RuntimeError(msg) 2025-12-04T14:25:34.4120660Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 2025-12-04T14:25:34.4120663Z 2025-12-04T14:25:34.4120735Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4121113Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4121116Z 2025-12-04T14:25:34.4121204Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4121267Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T14:25:34.4121329Z ======================= 1 failed, 14 deselected in 8.78s ======================= 2025-12-04T14:25:34.4121365Z Got exit code 1 2025-12-04T14:25:34.4121404Z Retrying single test... 2025-12-04T14:25:34.4121626Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-734392bac176fca7.xml 2025-12-04T14:25:34.4121682Z ============================= test session starts ============================== 2025-12-04T14:25:34.4121794Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.4121834Z cachedir: .pytest_cache 2025-12-04T14:25:34.4121989Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.4122035Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.4122074Z configfile: pytest.ini 2025-12-04T14:25:34.4122234Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.4122587Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4122636Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.4122978Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4123035Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.4123090Z collected 15 items / 14 deselected / 1 selected 2025-12-04T14:25:34.4123428Z stepcurrent: skipping 8 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.4123471Z Running 1 items in this shard
2025-12-04T14:25:34.4123473Z
2025-12-04T14:25:34.4123890Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 14:22:00.927000 368632 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 368701
2025-12-04T14:25:34.4124043Z I1204 14:22:00.928000 368632 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 368702
2025-12-04T14:25:34.4124207Z I1204 14:22:00.928000 368632 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 368703
2025-12-04T14:25:34.4124367Z I1204 14:22:00.929000 368632 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 368704
[each worker again emits the same warning blocks shown for the previous attempt: the FutureWarning at test lines 113 and 124, the _get_pg_default_device UserWarning from _optim_utils.py:1190 and _shard_utils.py:59, and the _state_dict_utils.py ShardedTensor FutureWarning triplets]
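The _state_dict_utils warnings flag the older ShardedTensor path; DTensor is its successor. A minimal sketch of building the equivalent row-sharded tensor with DTensor, assuming PyTorch 2.5+ (where torch.distributed.tensor is public) and a 4-GPU launch mirroring the 4-process setup in this test; the file name and tensor here are illustrative (on ROCm builds the device type is still "cuda" and the "nccl" backend maps to RCCL):

    # Launch: torchrun --nproc-per-node=4 dtensor_demo.py (hypothetical file).
    import os

    import torch
    import torch.distributed as dist
    from torch.distributed.device_mesh import init_device_mesh
    from torch.distributed.tensor import Shard, distribute_tensor

    dist.init_process_group("nccl")
    torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

    mesh = init_device_mesh("cuda", (dist.get_world_size(),))
    weight = torch.randn(16, 8, device="cuda")
    # Shard dim 0 across the mesh; with 4 ranks each holds a [4, 8] slice.
    dweight = distribute_tensor(weight, mesh, placements=[Shard(0)])
    print(dweight.to_local().shape)
    dist.destroy_process_group()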
2025-12-04T14:25:34.4137656Z tensor.dtype, 2025-12-04T14:25:34.4137790Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4137954Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4138249Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4138394Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4138669Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4138810Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4139077Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4139218Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4139482Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4139620Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4139884Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4140012Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4140322Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4140461Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4141027Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 
2025-12-04T14:25:34.4141135Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4141325Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4141791Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4141897Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4142099Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4142255Z E1204 14:22:08.346000 368703 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4142412Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4142561Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4142839Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4142981Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4143280Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4143393Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4143659Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4143797Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4144061Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4144199Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4144464Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4144590Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4144855Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4144995Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4145555Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 803209216 and is now 2850029568. 2025-12-04T14:25:34.4145662Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4145848Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4146310Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4146414Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4146625Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4146790Z E1204 14:22:08.383000 368704 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.4146917Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4147067Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4147377Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4147522Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4147798Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4147910Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4148176Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4148312Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4148576Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4148713Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4148978Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4149103Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4149368Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4149508Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4150064Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 2025-12-04T14:25:34.4150203Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4150388Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4150852Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4150987Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4151186Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4151340Z E1204 14:22:08.394000 368701 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.4151465Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4151646Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4151922Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4152069Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4152347Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4152459Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4152727Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4152863Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4153129Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4153267Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4153532Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4153657Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4153925Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4154062Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4154621Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 
2025-12-04T14:25:34.4154725Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4154912Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4155386Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4155499Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4155698Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4155851Z E1204 14:22:08.429000 368702 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4155914Z FAILED [8.7185s] [100%] 2025-12-04T14:25:34.4155916Z 2025-12-04T14:25:34.4155972Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4156162Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T14:25:34.4156209Z Traceback (most recent call last): 2025-12-04T14:25:34.4156368Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4156412Z self._join_processes(fn) 2025-12-04T14:25:34.4156581Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4156634Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4156810Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4156853Z raise RuntimeError(error) 2025-12-04T14:25:34.4156930Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T14:25:34.4156974Z Traceback (most recent call last): 2025-12-04T14:25:34.4157132Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4157174Z getattr(self, test_name)() 2025-12-04T14:25:34.4157330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4157364Z fn() 2025-12-04T14:25:34.4157513Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4157553Z method(*args, **kwargs) 2025-12-04T14:25:34.4157701Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4157741Z method(*args, **kwargs) 2025-12-04T14:25:34.4157887Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4157924Z with policy(): 2025-12-04T14:25:34.4158074Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4158114Z raise RuntimeError(msg) 2025-12-04T14:25:34.4158549Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T14:25:34.4158551Z 2025-12-04T14:25:34.4158626Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4158974Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4158987Z 2025-12-04T14:25:34.4159085Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4159087Z 2025-12-04T14:25:34.4159089Z 2025-12-04T14:25:34.4159164Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4159248Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.4159516Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-734392bac176fca7.xml - 2025-12-04T14:25:34.4159575Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4159951Z FAILED [8.7185s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T14:25:34.4159998Z Traceback (most recent call last): 2025-12-04T14:25:34.4160160Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4160236Z getattr(self, test_name)() 2025-12-04T14:25:34.4160393Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4160426Z fn() 2025-12-04T14:25:34.4160575Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4160614Z method(*args, **kwargs) 2025-12-04T14:25:34.4160763Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4160802Z method(*args, **kwargs) 2025-12-04T14:25:34.4160949Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4160986Z with policy(): 2025-12-04T14:25:34.4161135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4161175Z raise RuntimeError(msg) 2025-12-04T14:25:34.4161616Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568.
2025-12-04T14:25:34.4161618Z 
2025-12-04T14:25:34.4161693Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4162039Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.4162043Z 
2025-12-04T14:25:34.4162128Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4162191Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.4162252Z ======================= 1 failed, 14 deselected in 8.87s =======================
2025-12-04T14:25:34.4162286Z Got exit code 1
2025-12-04T14:25:34.4162575Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.4162703Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T14:25:34.4162926Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f98ec4c96d1d5000.xml
2025-12-04T14:25:34.4163014Z ============================= test session starts ==============================
2025-12-04T14:25:34.4163125Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.4163166Z cachedir: .pytest_cache
2025-12-04T14:25:34.4163320Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.4163365Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.4163404Z configfile: pytest.ini
2025-12-04T14:25:34.4163607Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.4163966Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4164018Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.4164362Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4164417Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.4164469Z collected 15 items / 9 deselected / 6 selected
2025-12-04T14:25:34.4164520Z stepcurrent: skipping 9 already run items.
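The check that failed consistently above compares per-device caching-allocator usage taken before and after the test body. A minimal sketch of that pattern using only public torch.cuda APIs (illustrative, not the actual CudaMemoryLeakCheck code in torch/testing/_internal/common_utils.py):

    import gc
    import torch

    def assert_no_allocator_growth(test_fn) -> None:
        # Hypothetical helper mirroring PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: record
        # caching-allocator usage per device, run the test, and fail on growth.
        torch.cuda.synchronize()
        before = [torch.cuda.memory_allocated(d)
                  for d in range(torch.cuda.device_count())]
        test_fn()
        gc.collect()  # drop dead Python references before re-measuring
        torch.cuda.synchronize()
        for d, baseline in enumerate(before):
            now = torch.cuda.memory_allocated(d)
            if now > baseline:
                raise RuntimeError(
                    f"allocator memory was {baseline} and is now {now} on device {d}")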
2025-12-04T14:25:34.4164561Z Running 6 items in this shard 2025-12-04T14:25:34.4164563Z 2025-12-04T14:25:34.4164979Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 14:22:12.485000 369034 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 369103 2025-12-04T14:25:34.4165134Z I1204 14:22:12.486000 369034 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 369104 2025-12-04T14:25:34.4165283Z I1204 14:22:12.486000 369034 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 369105 2025-12-04T14:25:34.4165431Z I1204 14:22:12.487000 369034 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 369106 2025-12-04T14:25:34.4166110Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4166155Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4166822Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4166864Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4167529Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4167594Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4168259Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4168320Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4168814Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4168863Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4169350Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4169397Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4169878Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4169925Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4170446Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4170491Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4171156Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4171198Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4171863Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4171904Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4172559Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T14:25:34.4172631Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4173319Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4173361Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4173848Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4173905Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4174386Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4174442Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4174924Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4174979Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4175460Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4175514Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4175748Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4175792Z local_shape = tensor.shape 2025-12-04T14:25:34.4176023Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4176059Z tensor.shape, 2025-12-04T14:25:34.4176289Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T14:25:34.4176331Z local_shape = tensor.shape 2025-12-04T14:25:34.4176562Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4176608Z tensor.dtype, 2025-12-04T14:25:34.4176836Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4177001Z tensor.shape, 2025-12-04T14:25:34.4177229Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4177265Z tensor.dtype, 2025-12-04T14:25:34.4177493Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4177555Z local_shape = tensor.shape 2025-12-04T14:25:34.4177784Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4177820Z tensor.shape, 2025-12-04T14:25:34.4178051Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4178086Z tensor.dtype, 2025-12-04T14:25:34.4178315Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4178355Z local_shape = tensor.shape 2025-12-04T14:25:34.4178583Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4178619Z tensor.shape, 2025-12-04T14:25:34.4178849Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T14:25:34.4178884Z tensor.dtype, 2025-12-04T14:25:34.4179020Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4179173Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4179452Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4179597Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4179874Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4179988Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4180297Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4180438Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4180705Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4180844Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4181112Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4181274Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4181541Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4181679Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4182261Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1268776960 and is now 2850029568. 
2025-12-04T14:25:34.4182370Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4182559Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4183025Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4183132Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4183333Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4183489Z E1204 14:22:20.326000 369106 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.4183618Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4183768Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4184046Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4184189Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4184464Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4184578Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4184845Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4184982Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4185249Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4185399Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4185674Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4185800Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4186068Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4186228Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4186781Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T14:25:34.4186888Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4187074Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4187539Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4187647Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4187847Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4188003Z E1204 14:22:20.332000 369104 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4188129Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4188281Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4188559Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4188704Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4188977Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4189089Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4189357Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4189493Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4189768Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4189916Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4190212Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4190337Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4190634Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4190773Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4191329Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 2025-12-04T14:25:34.4191434Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4191622Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4192082Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4192188Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4192388Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4192540Z E1204 14:22:20.379000 369103 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.4192672Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4192822Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4193102Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4193245Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4193517Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4193630Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4193895Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4194056Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4194320Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4194457Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4194739Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4194864Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4195132Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4195271Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4195824Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 
2025-12-04T14:25:34.4195928Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4196114Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4196575Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4196789Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4196994Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4197146Z E1204 14:22:20.397000 369105 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4197186Z FAILED [9.1200s] [ 16%] 2025-12-04T14:25:34.4197190Z 2025-12-04T14:25:34.4197244Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4197433Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T14:25:34.4197478Z Traceback (most recent call last): 2025-12-04T14:25:34.4197642Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4197684Z self._join_processes(fn) 2025-12-04T14:25:34.4197856Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4197909Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4198084Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4198159Z raise RuntimeError(error) 2025-12-04T14:25:34.4198237Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.4198280Z Traceback (most recent call last): 2025-12-04T14:25:34.4198439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4198480Z getattr(self, test_name)() 2025-12-04T14:25:34.4198636Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4198669Z fn() 2025-12-04T14:25:34.4198839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4198880Z method(*args, **kwargs) 2025-12-04T14:25:34.4199026Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4199067Z method(*args, **kwargs) 2025-12-04T14:25:34.4199215Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4199251Z with policy(): 2025-12-04T14:25:34.4199400Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4199440Z raise RuntimeError(msg) 2025-12-04T14:25:34.4199882Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T14:25:34.4199884Z 2025-12-04T14:25:34.4199958Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4200336Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4200339Z 2025-12-04T14:25:34.4200429Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4200431Z 2025-12-04T14:25:34.4200433Z 2025-12-04T14:25:34.4200507Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4200591Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.4200862Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f98ec4c96d1d5000.xml - 2025-12-04T14:25:34.4200921Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4201276Z FAILED [9.1200s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.4201321Z Traceback (most recent call last): 2025-12-04T14:25:34.4201484Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4201525Z getattr(self, test_name)() 2025-12-04T14:25:34.4201685Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4201718Z fn() 2025-12-04T14:25:34.4201869Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4201907Z method(*args, **kwargs) 2025-12-04T14:25:34.4202074Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4202126Z method(*args, **kwargs) 2025-12-04T14:25:34.4202277Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4202312Z with policy(): 2025-12-04T14:25:34.4202462Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4202501Z raise RuntimeError(msg) 2025-12-04T14:25:34.4202964Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1268776960 and is now 2850029568.
2025-12-04T14:25:34.4202968Z 
2025-12-04T14:25:34.4203041Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4203385Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.4203388Z 
2025-12-04T14:25:34.4203475Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4203537Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.4203599Z ======================= 1 failed, 9 deselected in 9.28s ========================
2025-12-04T14:25:34.4203635Z Got exit code 1
2025-12-04T14:25:34.4203674Z Retrying single test...
2025-12-04T14:25:34.4203897Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-93d56b8177829149.xml
2025-12-04T14:25:34.4203955Z ============================= test session starts ==============================
2025-12-04T14:25:34.4204067Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.4204108Z cachedir: .pytest_cache
2025-12-04T14:25:34.4204262Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.4204307Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.4204346Z configfile: pytest.ini
2025-12-04T14:25:34.4204509Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.4204871Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4204921Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.4205266Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4205321Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.4205376Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.4205710Z stepcurrent: skipping 9 already run items.
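The two PytestCollectionWarnings during collection are benign: pytest tries to collect any class whose name starts with "Test" but refuses ones that define __init__, so these nn.Module helpers are simply skipped. If the warning noise mattered, one common pytest idiom (not necessarily what this test file does) is to mark the helper class as non-test:

    import torch

    class TestDummyModel(torch.nn.Module):
        # pytest tries to collect any class named Test*; __test__ = False marks
        # this nn.Module as a helper and silences the PytestCollectionWarning.
        __test__ = False

        def __init__(self) -> None:
            super().__init__()
            self.net = torch.nn.Linear(8, 8)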
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.4205754Z Running 1 items in this shard
2025-12-04T14:25:34.4205756Z 
2025-12-04T14:25:34.4206168Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 14:22:24.129000 369436 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 369505
2025-12-04T14:25:34.4206342Z I1204 14:22:24.130000 369436 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 369506
2025-12-04T14:25:34.4206492Z I1204 14:22:24.130000 369436 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 369507
2025-12-04T14:25:34.4206641Z I1204 14:22:24.131000 369436 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 369508
2025-12-04T14:25:34.4207339Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4207383Z FSDP.set_state_dict_type(
[warning emitted once per rank; three further identical copies omitted]
2025-12-04T14:25:34.4210045Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4210094Z device = _get_pg_default_device(group)
[warning emitted once per rank; three further identical copies omitted]
2025-12-04T14:25:34.4212441Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4212484Z FSDP.set_state_dict_type(
[warning emitted once per rank; three further identical copies omitted]
2025-12-04T14:25:34.4215066Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4215123Z distributed_c10d._get_pg_default_device(pg).type
[warning emitted once per rank; three further identical copies omitted]
2025-12-04T14:25:34.4217000Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T14:25:34.4217044Z local_shape = tensor.shape
2025-12-04T14:25:34.4217281Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T14:25:34.4217317Z tensor.shape,
2025-12-04T14:25:34.4217549Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T14:25:34.4217584Z tensor.dtype,
[warning set emitted once per rank; three further identical copies omitted]
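The FutureWarnings above point at the ShardedTensor-to-DTensor migration. A minimal sketch of the suggested direction, assuming a recent PyTorch where torch.distributed.tensor is public (older builds expose the same names under torch.distributed._tensor); launch with torchrun --nproc-per-node=4 so the process-group environment variables exist:

    import os

    import torch
    from torch.distributed.device_mesh import init_device_mesh
    from torch.distributed.tensor import Shard, distribute_tensor

    def main() -> None:
        world_size = int(os.environ["WORLD_SIZE"])  # set by torchrun
        # A 1-D mesh over all ranks; this also initializes the default group.
        mesh = init_device_mesh("cuda", (world_size,))
        full = torch.randn(16, 8)
        # Shard dim 0 across the mesh instead of building a ShardedTensor.
        dtensor = distribute_tensor(full, mesh, placements=[Shard(0)])
        # On 4 ranks, each rank holds a 4x8 local shard.
        print(dtensor.to_local().shape)

    if __name__ == "__main__":
        main()

On a ROCm build such as the one in this job, the "cuda" device type is what HIP devices masquerade as, so the same sketch applies.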
2025-12-04T14:25:34.4220141Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4220363Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.4220645Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4220817Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
2025-12-04T14:25:34.4221092Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4221208Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
2025-12-04T14:25:34.4221475Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4221613Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.4221879Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4222017Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.4222286Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4222413Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
2025-12-04T14:25:34.4222680Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4222818Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
2025-12-04T14:25:34.4223377Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664.
2025-12-04T14:25:34.4223484Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935] 
2025-12-04T14:25:34.4223671Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4224138Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.4224473Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4224628Z E1204 14:22:31.499000 369505 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.4224758Z E1204 14:22:31.518000 369507 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[traceback and repro instructions identical to process 0; omitted]
2025-12-04T14:25:34.4227900Z E1204 14:22:31.518000 369507 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568.
2025-12-04T14:25:34.4229134Z E1204 14:22:31.518000 369507 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.4229262Z E1204 14:22:31.561000 369508 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[traceback and repro instructions identical to process 0; omitted]
2025-12-04T14:25:34.4232442Z E1204 14:22:31.561000 369508 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 803209216 and is now 2850029568.
2025-12-04T14:25:34.4233705Z E1204 14:22:31.561000 369508 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.4233834Z E1204 14:22:31.581000 369506 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[traceback and repro instructions identical to process 0; omitted]
2025-12-04T14:25:34.4236967Z E1204 14:22:31.581000 369506 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568.
2025-12-04T14:25:34.4238216Z E1204 14:22:31.581000 369506 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.4238255Z FAILED [8.7190s] [100%]
2025-12-04T14:25:34.4238257Z 
2025-12-04T14:25:34.4238311Z =================================== FAILURES ===================================
2025-12-04T14:25:34.4238499Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda _
2025-12-04T14:25:34.4238545Z Traceback (most recent call last):
2025-12-04T14:25:34.4238706Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.4238750Z     self._join_processes(fn)
2025-12-04T14:25:34.4238920Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.4238973Z     self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.4239148Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.4239191Z     raise RuntimeError(error)
2025-12-04T14:25:34.4239268Z RuntimeError: Process 0 exited with error code 10 and exception:
[child traceback identical to the one logged by process 0 above; omitted]
2025-12-04T14:25:34.4240929Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664.
2025-12-04T14:25:34.4241531Z Process 2 exited with error code 10 and exception:
[child traceback identical to the one logged by process 2 above; omitted]
2025-12-04T14:25:34.4243176Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568.
2025-12-04T14:25:34.4243761Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.4243848Z Process 0 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.4244114Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-93d56b8177829149.xml -
2025-12-04T14:25:34.4244173Z =========================== short test summary info ============================
2025-12-04T14:25:34.4244526Z FAILED [8.7190s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
[the summary repeats the same per-process tracebacks, leak reports, and repro instructions; omitted]
2025-12-04T14:25:34.4249207Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.4249267Z ======================= 1 failed, 14 deselected in 8.88s =======================
2025-12-04T14:25:34.4249304Z Got exit code 1
2025-12-04T14:25:34.4249342Z Retrying single test...
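The parent-side traceback above (run_test, _join_processes, _check_return_codes) shows the structure of these failures: the test harness spawns one child process per rank, joins them, and converts any non-zero child exit code into a RuntimeError in the parent. A minimal stdlib sketch of that pattern, not PyTorch's actual implementation; the exit code 10 is taken from the "exiting process N with exit code: 10" lines above and stands in for the mem-leak sentinel:

    import multiprocessing as mp
    import sys

    MEM_LEAK_EXIT_CODE = 10  # sentinel matching the child exit code in the log

    def run_test(rank: int) -> None:
        leaked = True  # stand-in for the real post-test memory check
        if leaked:
            sys.exit(MEM_LEAK_EXIT_CODE)

    def check_return_codes(procs) -> None:
        # Mirrors _check_return_codes: any non-zero child exit becomes a
        # RuntimeError in the parent, which is what pytest then reports.
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(
                    f"Process {rank} exited with error code {p.exitcode}")

    if __name__ == "__main__":
        procs = [mp.Process(target=run_test, args=(r,)) for r in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        check_return_codes(procs)

This is why a leak detected inside any one rank surfaces as "Process 0 exited with error code 10" at the pytest level rather than as an ordinary assertion failure.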
2025-12-04T14:25:34.4249565Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-eb237970c4202697.xml
2025-12-04T14:25:34.4249622Z ============================= test session starts ==============================
2025-12-04T14:25:34.4249756Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.4249796Z cachedir: .pytest_cache
2025-12-04T14:25:34.4249957Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.4250006Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.4250048Z configfile: pytest.ini
2025-12-04T14:25:34.4250252Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.4250613Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4250664Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.4251014Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4251070Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.4251126Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.4251459Z stepcurrent: skipping 9 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.4251501Z Running 1 items in this shard
2025-12-04T14:25:34.4251503Z 
2025-12-04T14:25:34.4251918Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 14:22:35.257000 369838 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 369907
2025-12-04T14:25:34.4252070Z I1204 14:22:35.258000 369838 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 369908
2025-12-04T14:25:34.4252221Z I1204 14:22:35.259000 369838 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 369909
2025-12-04T14:25:34.4252369Z I1204 14:22:35.260000 369838 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 369910
[all four ranks again emitted the same warning set as in the previous attempt: the FSDP.set_state_dict_type FutureWarnings at test lines 113 and 124, the `_get_pg_default_device` UserWarnings from _optim_utils.py:1190 and _shard_utils.py:59, and the ShardedTensor deprecation FutureWarnings from _state_dict_utils.py; omitted]
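For the `_get_pg_default_device` warning that keeps recurring, the message itself names the replacements: `_get_object_coll_device` for picking a device for object collectives and `_device_capability(group)` for querying supported device types. A sketch of the swap under those names, hedged heavily: both helpers are private (underscore-prefixed) and not a stable API, and the assumption here is that `_get_object_coll_device` takes the same optional group argument as the deprecated helper:

    import torch.distributed as dist
    from torch.distributed import distributed_c10d

    def main() -> None:
        # Single-process gloo group, just so the helpers have a default group.
        dist.init_process_group("gloo", init_method="tcp://127.0.0.1:29500",
                                rank=0, world_size=1)
        # Deprecated helper, as shown in the warning:
        old = distributed_c10d._get_pg_default_device()
        # Replacement suggested by the warning for object collectives:
        new = distributed_c10d._get_object_coll_device()
        print(old, new)
        dist.destroy_process_group()

    if __name__ == "__main__":
        main()

Since the warnings come from inside torch.distributed.fsdp itself, they are noise from the test's point of view; the swap matters only for user code that calls the helper directly.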
2025-12-04T14:25:34.4265857Z E1204 14:22:43.011000 369910 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[traceback and repro instructions identical to the earlier attempt; omitted]
2025-12-04T14:25:34.4269039Z E1204 14:22:43.011000 369910 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 958398464 and is now 2850029568.
2025-12-04T14:25:34.4270301Z E1204 14:22:43.011000 369910 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.4270429Z E1204 14:22:43.062000 369907 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[traceback and repro instructions identical to the earlier attempt; omitted]
2025-12-04T14:25:34.4273598Z E1204 14:22:43.062000 369907 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664.
2025-12-04T14:25:34.4274813Z E1204 14:22:43.062000 369907 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.4274940Z E1204 14:22:43.095000 369908 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[traceback and repro instructions identical to the earlier attempt; omitted]
2025-12-04T14:25:34.4278104Z E1204 14:22:43.095000 369908 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568.
2025-12-04T14:25:34.4278208Z E1204 14:22:43.095000 369908 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4278394Z E1204 14:22:43.095000 369908 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4278857Z E1204 14:22:43.095000 369908 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4278963Z E1204 14:22:43.095000 369908 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4279165Z E1204 14:22:43.095000 369908 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4279320Z E1204 14:22:43.095000 369908 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4279448Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4279599Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4279876Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4280039Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4280351Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4280463Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4280755Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4280894Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4281159Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4281297Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4281561Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4281688Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4281954Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4282093Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4282644Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T14:25:34.4282748Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4282936Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4283396Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4283501Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4283699Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4283854Z E1204 14:22:43.097000 369909 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4283892Z FAILED [9.0210s] [100%] 2025-12-04T14:25:34.4283894Z 2025-12-04T14:25:34.4283948Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4284151Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T14:25:34.4284211Z Traceback (most recent call last): 2025-12-04T14:25:34.4284371Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4284413Z self._join_processes(fn) 2025-12-04T14:25:34.4284585Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4284636Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4284831Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4284875Z raise RuntimeError(error) 2025-12-04T14:25:34.4284954Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.4284999Z Traceback (most recent call last): 2025-12-04T14:25:34.4285159Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4285200Z getattr(self, test_name)() 2025-12-04T14:25:34.4285357Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T14:25:34.4285390Z fn() 2025-12-04T14:25:34.4285539Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4285578Z method(*args, **kwargs) 2025-12-04T14:25:34.4285727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4285765Z method(*args, **kwargs) 2025-12-04T14:25:34.4285912Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4285948Z with policy(): 2025-12-04T14:25:34.4286098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4286137Z raise RuntimeError(msg) 2025-12-04T14:25:34.4286579Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 958398464 and is now 2850029568. 2025-12-04T14:25:34.4286581Z 2025-12-04T14:25:34.4286654Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4286997Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4287001Z 2025-12-04T14:25:34.4287088Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4287090Z 2025-12-04T14:25:34.4287092Z 2025-12-04T14:25:34.4287164Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4287250Z Process 3 terminated with exit code 10, terminating remaining processes. 
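[editor's note] For context on what the RuntimeError above is reporting: with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 the test harness snapshots caching-allocator and driver-level memory counters before the test body and fails the test if they grew afterwards, which is exactly the pair of "was X and is now Y" numbers printed in the message. A minimal sketch of that idea in Python follows; it is not PyTorch's actual CudaMemoryLeakCheck implementation, and run_with_leak_check is a hypothetical helper used only for illustration.

import torch

def run_with_leak_check(test_fn, device: int = 0):
    """Hypothetical helper mirroring the idea behind
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: fail if memory grew across the test."""
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)    # caching allocator bytes
    free_before, _total = torch.cuda.mem_get_info(device) # driver-level view
    test_fn()
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    if alloc_after > alloc_before or free_after < free_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator allocated "
            f"memory was {alloc_before} and is now {alloc_after}; driver free "
            f"memory went from {free_before} to {free_after}"
        )

The driver-level comparison in a real checker has to tolerate lazy context and library allocations (note the large driver deltas in the log even when the allocator delta is only a few KB), which is why the harness reports both counters rather than failing on the driver number alone.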
2025-12-04T14:25:34.4287515Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-eb237970c4202697.xml - 2025-12-04T14:25:34.4287575Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4287927Z FAILED [9.0210s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.4288001Z Traceback (most recent call last): 2025-12-04T14:25:34.4288163Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4288205Z getattr(self, test_name)() 2025-12-04T14:25:34.4288362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4288396Z fn() 2025-12-04T14:25:34.4288545Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4288584Z method(*args, **kwargs) 2025-12-04T14:25:34.4288753Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4288793Z method(*args, **kwargs) 2025-12-04T14:25:34.4288940Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4288978Z with policy(): 2025-12-04T14:25:34.4289127Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4289167Z raise RuntimeError(msg) 2025-12-04T14:25:34.4289605Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 958398464 and is now 2850029568. 2025-12-04T14:25:34.4289607Z 2025-12-04T14:25:34.4289681Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4290024Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4290028Z 2025-12-04T14:25:34.4290112Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4290212Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T14:25:34.4290273Z ======================= 1 failed, 14 deselected in 9.18s ======================= 2025-12-04T14:25:34.4290309Z Got exit code 1 2025-12-04T14:25:34.4290598Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4290724Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T14:25:34.4290948Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-3914673ed7d527c4.xml 2025-12-04T14:25:34.4291006Z ============================= test session starts ============================== 2025-12-04T14:25:34.4291117Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.4291157Z cachedir: .pytest_cache 2025-12-04T14:25:34.4291316Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.4291360Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.4291399Z configfile: pytest.ini 2025-12-04T14:25:34.4291561Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.4291920Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4291998Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.4292341Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4292395Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.4292450Z collected 15 items / 10 deselected / 5 selected 2025-12-04T14:25:34.4292501Z stepcurrent: skipping 10 already run items. 2025-12-04T14:25:34.4292543Z Running 5 items in this shard 2025-12-04T14:25:34.4292545Z 2025-12-04T14:25:34.4292990Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 14:22:46.986000 370240 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 370309 2025-12-04T14:25:34.4293147Z I1204 14:22:46.987000 370240 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 370310 2025-12-04T14:25:34.4293299Z I1204 14:22:46.987000 370240 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 370311 2025-12-04T14:25:34.4293448Z I1204 14:22:46.988000 370240 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 370312 2025-12-04T14:25:34.4294128Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. 
Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4294171Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4294839Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4294879Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4295543Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4295585Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4296251Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4296293Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4296784Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4296854Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4297341Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4297404Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4297887Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T14:25:34.4297933Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4298418Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4298463Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4299137Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4299179Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4299841Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4299882Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4300580Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4300621Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4301285Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4301338Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4301835Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T14:25:34.4301894Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4302393Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4302452Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4302936Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4302992Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4303474Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4305955Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4306198Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4306246Z local_shape = tensor.shape 2025-12-04T14:25:34.4306476Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4306514Z tensor.shape, 2025-12-04T14:25:34.4306742Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4306786Z tensor.dtype, 2025-12-04T14:25:34.4307013Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4307056Z local_shape = tensor.shape 2025-12-04T14:25:34.4307284Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4307324Z local_shape = tensor.shape 2025-12-04T14:25:34.4307551Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4307587Z tensor.shape, 2025-12-04T14:25:34.4307813Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T14:25:34.4307850Z tensor.shape, 2025-12-04T14:25:34.4308078Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4308134Z tensor.dtype, 2025-12-04T14:25:34.4308380Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4308415Z tensor.dtype, 2025-12-04T14:25:34.4308643Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4308683Z local_shape = tensor.shape 2025-12-04T14:25:34.4308912Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4308968Z tensor.shape, 2025-12-04T14:25:34.4309198Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4309233Z tensor.dtype, 2025-12-04T14:25:34.4309372Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4309527Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4309812Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4309959Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4310299Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4310419Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4310691Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4310835Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4311104Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4311245Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4311514Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4311644Z E1204 14:22:54.322000 370309 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4311913Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4312051Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4312618Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T14:25:34.4312768Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4312957Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4313451Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4313558Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4313762Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4313918Z E1204 14:22:54.322000 370309 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.4314049Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4314197Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4314475Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4314618Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4314892Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4315005Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4315270Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4315409Z E1204 14:22:54.338000 370310 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4315678Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4315815Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4316080Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4316206Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4316471Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4316610Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4317172Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T14:25:34.4317291Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4317478Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4317971Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4318080Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4318280Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4318435Z E1204 14:22:54.338000 370310 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4318562Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4318714Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4318992Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4319136Z E1204 14:22:54.353000 370311 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4319412Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4319523Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4319791Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4319928Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4320248Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4320386Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4320653Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4320779Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4321049Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4321207Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4321773Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T14:25:34.4321879Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4322087Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4322551Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4322658Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4322858Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4323013Z E1204 14:22:54.353000 370311 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4323141Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4323291Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4323566Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4323709Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4323981Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4324095Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4324360Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4324500Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4324768Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4324906Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4325175Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4325300Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4325577Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4325728Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4326298Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 803209216 and is now 2826960896. 2025-12-04T14:25:34.4326405Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4326592Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4327054Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4327158Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4327360Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4327514Z E1204 14:22:54.386000 370312 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.4327554Z FAILED [8.5193s] [ 20%] 2025-12-04T14:25:34.4327557Z 2025-12-04T14:25:34.4327615Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4327802Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T14:25:34.4327849Z Traceback (most recent call last): 2025-12-04T14:25:34.4328011Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4328054Z self._join_processes(fn) 2025-12-04T14:25:34.4328226Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4328279Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4328454Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4328499Z raise RuntimeError(error) 2025-12-04T14:25:34.4328577Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.4328621Z Traceback (most recent call last): 2025-12-04T14:25:34.4328779Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4328820Z getattr(self, test_name)() 2025-12-04T14:25:34.4328976Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T14:25:34.4329011Z fn() 2025-12-04T14:25:34.4329162Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4329202Z method(*args, **kwargs) 2025-12-04T14:25:34.4329351Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4329412Z method(*args, **kwargs) 2025-12-04T14:25:34.4329574Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4329611Z with policy(): 2025-12-04T14:25:34.4329760Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4329800Z raise RuntimeError(msg) 2025-12-04T14:25:34.4330305Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T14:25:34.4330308Z 2025-12-04T14:25:34.4330383Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4330729Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4330733Z 2025-12-04T14:25:34.4330817Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4330820Z 2025-12-04T14:25:34.4330821Z 2025-12-04T14:25:34.4330897Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4330982Z Process 0 terminated with exit code 10, terminating remaining processes. 
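[editor's note] The repeated "Please use DTensor instead and we are deprecating ShardedTensor" FutureWarnings from _state_dict_utils.py above point at the DTensor-based sharding API as the replacement. A hedged sketch of that replacement follows, using the torch.distributed.tensor names from recent PyTorch releases (the module path has moved between versions, e.g. from torch.distributed._tensor); it assumes an initialized process group with one rank per GPU.

import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Shard, distribute_tensor

def shard_across_ranks(world_size: int):
    # One-dimensional device mesh over the job's CUDA devices.
    mesh = init_device_mesh("cuda", (world_size,))
    full = torch.randn(1024, 1024)
    # Shard dim 0 across the mesh; each rank holds only its slice locally.
    dtensor = distribute_tensor(full, mesh, placements=[Shard(0)])
    # Unlike ShardedTensor, .shape/.dtype are ordinary global tensor metadata,
    # the same fields the deprecated code paths in the warnings are reading.
    return dtensor.shape, dtensor.dtype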
2025-12-04T14:25:34.4331252Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-3914673ed7d527c4.xml - 2025-12-04T14:25:34.4331312Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4331669Z FAILED [8.5193s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.4331714Z Traceback (most recent call last): 2025-12-04T14:25:34.4331879Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4331920Z getattr(self, test_name)() 2025-12-04T14:25:34.4332078Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4332112Z fn() 2025-12-04T14:25:34.4332263Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4332301Z method(*args, **kwargs) 2025-12-04T14:25:34.4332452Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4332492Z method(*args, **kwargs) 2025-12-04T14:25:34.4332640Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4332676Z with policy(): 2025-12-04T14:25:34.4332824Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4332865Z raise RuntimeError(msg) 2025-12-04T14:25:34.4333303Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T14:25:34.4333320Z 2025-12-04T14:25:34.4333393Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4333747Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4333749Z 2025-12-04T14:25:34.4333833Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4333895Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T14:25:34.4333957Z ======================= 1 failed, 10 deselected in 8.67s ======================= 2025-12-04T14:25:34.4333992Z Got exit code 1 2025-12-04T14:25:34.4334032Z Retrying single test... 
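[editor's note] The FSDP.state_dict_type()/FSDP.set_state_dict_type() FutureWarnings earlier in this shard name their replacements directly: get_state_dict() and set_state_dict() from torch.distributed.checkpoint.state_dict (the API-doc URL in the warning text). A hedged migration sketch follows; the keyword and option names match recent PyTorch but may differ by version, and the save/load step is elided.

from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def roundtrip_state(model, optimizer):
    # Sharded (DTensor-backed) state dicts, replacing FSDP.set_state_dict_type.
    opts = StateDictOptions(full_state_dict=False, cpu_offload=False)
    model_sd, optim_sd = get_state_dict(model, optimizer, options=opts)
    # ... persist model_sd / optim_sd with torch.distributed.checkpoint ...
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=opts,
    )

Per the warning, these helpers are parallelism-agnostic (FSDP1, FSDP2, DDP), so the same call sites survive a later move off FSDP1.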
2025-12-04T14:25:34.4334273Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-45159ce44cc92740.xml 2025-12-04T14:25:34.4334332Z ============================= test session starts ============================== 2025-12-04T14:25:34.4334446Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.4334487Z cachedir: .pytest_cache 2025-12-04T14:25:34.4334643Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.4334688Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.4334727Z configfile: pytest.ini 2025-12-04T14:25:34.4334891Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.4335250Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4335300Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.4335643Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4335698Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.4335753Z collected 15 items / 14 deselected / 1 selected 2025-12-04T14:25:34.4336086Z stepcurrent: skipping 10 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4336130Z Running 1 items in this shard 2025-12-04T14:25:34.4336132Z 2025-12-04T14:25:34.4336549Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 14:22:58.090000 370642 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 370711 2025-12-04T14:25:34.4336706Z I1204 14:22:58.090000 370642 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 370712 2025-12-04T14:25:34.4336856Z I1204 14:22:58.091000 370642 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 370713 2025-12-04T14:25:34.4337003Z I1204 14:22:58.091000 370642 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 370714 2025-12-04T14:25:34.4337680Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T14:25:34.4337744Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4338411Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4338473Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4339131Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4339174Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4339832Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4339873Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4340404Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4340453Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4340936Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4340982Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4341466Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4341512Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4341998Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4342043Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4342709Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4342776Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4343473Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4343516Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4344177Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4344217Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4344885Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4344925Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4345409Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4345467Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4345950Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T14:25:34.4346008Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4346486Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4346542Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4347029Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4347103Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4347337Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4347380Z local_shape = tensor.shape 2025-12-04T14:25:34.4347611Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4347647Z tensor.shape, 2025-12-04T14:25:34.4347895Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4347937Z local_shape = tensor.shape 2025-12-04T14:25:34.4348168Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4348204Z tensor.dtype, 2025-12-04T14:25:34.4348437Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4348472Z tensor.shape, 2025-12-04T14:25:34.4348704Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4348738Z tensor.dtype, 2025-12-04T14:25:34.4348970Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4349010Z local_shape = tensor.shape 2025-12-04T14:25:34.4349239Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4349281Z local_shape = tensor.shape 2025-12-04T14:25:34.4349510Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T14:25:34.4349546Z tensor.shape, 2025-12-04T14:25:34.4349775Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4349810Z tensor.shape, 2025-12-04T14:25:34.4350040Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4350075Z tensor.dtype, 2025-12-04T14:25:34.4350334Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4350371Z tensor.dtype, 2025-12-04T14:25:34.4350505Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4350660Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4350940Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4351087Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4351363Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4351507Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4351773Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4351913Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4352207Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4352346Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4352613Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4352741Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4353008Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4353149Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4353715Z E1204 14:23:05.335000 370711 
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T14:25:34.4353827Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4354015Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4354488Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4354595Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4354799Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4354955Z E1204 14:23:05.335000 370711 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.4355084Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4355234Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4355513Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4355667Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4355951Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4356065Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4356331Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4356489Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4356755Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4356896Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4357161Z E1204 14:23:05.352000 370713 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4357286Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4357556Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4357693Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4358249Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T14:25:34.4358354Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4358543Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4359003Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4359109Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4359311Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4359464Z E1204 14:23:05.352000 370713 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4359592Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4359742Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4360030Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4360218Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4360492Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4360603Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4360901Z E1204 14:23:05.414000 370714 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4361040Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4361306Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4361444Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4361711Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4361838Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4362105Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4362245Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4362797Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1107296256 and is now 2826960896. 
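The FutureWarning repeated in the session above from _state_dict_utils.py ("Please use DTensor instead and we are deprecating ShardedTensor") points at the migration of sharded state-dict entries from ShardedTensor to DTensor. As a rough illustration of the replacement representation only (assuming an already-initialized process group, e.g. under torchrun; the tensor shape and one-dimensional mesh are placeholders):

import torch
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Shard, distribute_tensor

# Assumes the default process group is already initialized (e.g. via torchrun).
world_size = dist.get_world_size()
mesh = init_device_mesh("cuda", (world_size,))

# A DTensor sharded on dim 0 across the mesh plays the role that the
# ShardedTensor entries in an FSDP sharded state dict used to play.
full = torch.randn(16, 8)
dtensor = distribute_tensor(full, mesh, [Shard(0)])
print(dtensor.to_local().shape)  # each rank holds only its own shard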
2025-12-04T14:25:34.4362904Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4363089Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4363554Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4363661Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4363860Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4364017Z E1204 14:23:05.414000 370714 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.4364144Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4364306Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4364596Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4364738Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4365032Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4365143Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4365411Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4365548Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4365815Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4365951Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4366220Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4366346Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4366615Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4366753Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4367305Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T14:25:34.4367411Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4367596Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4368058Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4368163Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4368362Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4368528Z E1204 14:23:05.434000 370712 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4368578Z FAILED [8.5188s] [100%] 2025-12-04T14:25:34.4368580Z 2025-12-04T14:25:34.4368636Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4368825Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T14:25:34.4368871Z Traceback (most recent call last): 2025-12-04T14:25:34.4369031Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4369073Z self._join_processes(fn) 2025-12-04T14:25:34.4369263Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4369316Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4369492Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4369535Z raise RuntimeError(error) 2025-12-04T14:25:34.4369614Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T14:25:34.4369657Z Traceback (most recent call last): 2025-12-04T14:25:34.4369814Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4369855Z getattr(self, test_name)() 2025-12-04T14:25:34.4370010Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T14:25:34.4370044Z fn() 2025-12-04T14:25:34.4370226Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4370266Z method(*args, **kwargs) 2025-12-04T14:25:34.4370415Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4370455Z method(*args, **kwargs) 2025-12-04T14:25:34.4370603Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4370638Z with policy(): 2025-12-04T14:25:34.4370789Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4370828Z raise RuntimeError(msg) 2025-12-04T14:25:34.4371266Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T14:25:34.4371269Z 2025-12-04T14:25:34.4371342Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4371691Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4371694Z 2025-12-04T14:25:34.4371780Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4371782Z 2025-12-04T14:25:34.4371784Z 2025-12-04T14:25:34.4371858Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4371946Z Process 2 terminated with exit code 10, terminating remaining processes. 
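Each session above also re-prints two PytestCollectionWarnings at collection time, because the helper modules TestDummyModel and TestDummyModelUneven in the test file begin with "Test" and pytest tries to collect them as test classes. A standard way to silence that warning is to mark the helper as non-collectable; the sketch below is illustrative, and the layer layout is a placeholder rather than the test file's actual definition.

import torch

class TestDummyModel(torch.nn.Module):
    # pytest skips classes whose __test__ attribute is False, so this helper
    # is no longer picked up by collection despite its "Test" prefix and the
    # PytestCollectionWarning disappears.
    __test__ = False

    def __init__(self) -> None:
        super().__init__()
        self.net1 = torch.nn.Linear(8, 16)
        self.net2 = torch.nn.Linear(16, 8)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net2(torch.relu(self.net1(x)))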
2025-12-04T14:25:34.4372212Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-45159ce44cc92740.xml -
2025-12-04T14:25:34.4372285Z =========================== short test summary info ============================
2025-12-04T14:25:34.4372653Z FAILED [8.5188s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T14:25:34.4372699Z Traceback (most recent call last):
2025-12-04T14:25:34.4372861Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4372903Z     getattr(self, test_name)()
2025-12-04T14:25:34.4373084Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4373119Z     fn()
2025-12-04T14:25:34.4373268Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4373308Z     method(*args, **kwargs)
2025-12-04T14:25:34.4373456Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4373496Z     method(*args, **kwargs)
2025-12-04T14:25:34.4373643Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4373678Z     with policy():
2025-12-04T14:25:34.4373828Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4373867Z     raise RuntimeError(msg)
2025-12-04T14:25:34.4374304Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896.
2025-12-04T14:25:34.4374308Z 
2025-12-04T14:25:34.4374380Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4374723Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.4374725Z 
2025-12-04T14:25:34.4374809Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4374871Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.4374931Z ======================= 1 failed, 14 deselected in 8.68s =======================
2025-12-04T14:25:34.4374968Z Got exit code 1
2025-12-04T14:25:34.4375007Z Retrying single test...
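Every run above also emits the same FutureWarning from lines 113 and 124 of the test file: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated in favor of get_state_dict() and set_state_dict() from torch.distributed.checkpoint.state_dict, per the doc link in the warning. A minimal sketch of that migration follows; the plain Linear model and SGD optimizer are stand-ins for the FSDP-wrapped module and optimizer the test builds, and the options shown only approximate the offload_to_cpu=True case named in the failing test.

import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

# Stand-ins for the FSDP-wrapped model and its optimizer in the test.
model = torch.nn.Linear(4, 4)
optim = torch.optim.SGD(model.parameters(), lr=0.1)

# offload_to_cpu=True in the old API roughly corresponds to cpu_offload=True here.
options = StateDictOptions(full_state_dict=True, cpu_offload=True)

# Replaces the FSDP.set_state_dict_type(...) context plus the state_dict() calls
# made inside it.
model_sd, optim_sd = get_state_dict(model, optim, options=options)

# ... checkpoint model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...

# Replaces load_state_dict under the same deprecated context manager.
set_state_dict(
    model,
    optim,
    model_state_dict=model_sd,
    optim_state_dict=optim_sd,
    options=options,
)

The same two APIs cover FSDP1, FSDP2, and DDP wrappers, which is the portability point the deprecation warning is making.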
2025-12-04T14:25:34.4375230Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-551ba22b24efc1d9.xml 2025-12-04T14:25:34.4375287Z ============================= test session starts ============================== 2025-12-04T14:25:34.4375400Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.4375439Z cachedir: .pytest_cache 2025-12-04T14:25:34.4375595Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.4375638Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.4375678Z configfile: pytest.ini 2025-12-04T14:25:34.4375837Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.4376192Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4376268Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.4376610Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4376665Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.4376720Z collected 15 items / 14 deselected / 1 selected 2025-12-04T14:25:34.4377074Z stepcurrent: skipping 10 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4377117Z Running 1 items in this shard 2025-12-04T14:25:34.4377120Z 2025-12-04T14:25:34.4377534Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 14:23:09.143000 371044 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 371113 2025-12-04T14:25:34.4377686Z I1204 14:23:09.143000 371044 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 371114 2025-12-04T14:25:34.4377837Z I1204 14:23:09.144000 371044 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 371115 2025-12-04T14:25:34.4377985Z I1204 14:23:09.144000 371044 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 371116 2025-12-04T14:25:34.4378657Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T14:25:34.4378701Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4379364Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4379405Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4380066Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4380108Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4380802Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4380857Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4381351Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4381414Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4382373Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4382421Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4382904Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4382951Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4383440Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4383486Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4384159Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4384200Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4384861Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4384903Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4385566Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4385607Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4386094Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4386173Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4386651Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4386708Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4387396Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T14:25:34.4387439Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4387920Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4387975Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4388458Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4388514Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4388748Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4388791Z local_shape = tensor.shape 2025-12-04T14:25:34.4389020Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4389056Z tensor.shape, 2025-12-04T14:25:34.4389285Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4389322Z tensor.dtype, 2025-12-04T14:25:34.4389550Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4389593Z local_shape = tensor.shape 2025-12-04T14:25:34.4389821Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4389856Z tensor.shape, 2025-12-04T14:25:34.4390082Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4390118Z tensor.dtype, 2025-12-04T14:25:34.4390379Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4390439Z local_shape = tensor.shape 2025-12-04T14:25:34.4390666Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4390718Z tensor.shape, 2025-12-04T14:25:34.4390947Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T14:25:34.4390983Z tensor.dtype, 2025-12-04T14:25:34.4391210Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4391250Z local_shape = tensor.shape 2025-12-04T14:25:34.4391500Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4391537Z tensor.shape, 2025-12-04T14:25:34.4391764Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4391799Z tensor.dtype, 2025-12-04T14:25:34.4391934Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4392086Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4392366Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4392510Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4392786Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4392900Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4393170Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4393309Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4393576Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4393716Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4393985Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4394113Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4394379Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4394519Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4395075Z E1204 14:23:16.477000 371116 
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1262485504 and is now 2826960896. 2025-12-04T14:25:34.4395210Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4395397Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4395875Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4395985Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4396186Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4396343Z E1204 14:23:16.477000 371116 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.4396470Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4396621Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4396897Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4397041Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4397314Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4397426Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4397693Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4397829Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4398096Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4398233Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4398499Z E1204 14:23:16.481000 371114 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4398626Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4398894Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4399045Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4399606Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T14:25:34.4399711Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4399916Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4400425Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4400532Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4400730Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4400884Z E1204 14:23:16.481000 371114 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4401012Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4401161Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4401437Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4401582Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4401855Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4401968Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4402234Z E1204 14:23:16.500000 371113 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4402374Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4402641Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4402777Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4403043Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4403168Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4403451Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4403601Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4404175Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
2025-12-04T14:25:34.4404281Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4404469Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4404930Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4405034Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4405234Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4405387Z E1204 14:23:16.500000 371113 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.4405516Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4405667Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4405941Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4406084Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4406357Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4406469Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4406736Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4406875Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4407139Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4407277Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4407542Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4407688Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4407954Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4408090Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4408669Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T14:25:34.4408775Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4408961Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4409421Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4409526Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4409726Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4409881Z E1204 14:23:16.510000 371115 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4409920Z FAILED [8.6199s] [100%] 2025-12-04T14:25:34.4409922Z 2025-12-04T14:25:34.4409975Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4410163Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T14:25:34.4410246Z Traceback (most recent call last): 2025-12-04T14:25:34.4410407Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4410450Z self._join_processes(fn) 2025-12-04T14:25:34.4410619Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4410672Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4410847Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4410889Z raise RuntimeError(error) 2025-12-04T14:25:34.4410967Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.4411010Z Traceback (most recent call last): 2025-12-04T14:25:34.4411169Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4411209Z getattr(self, test_name)() 2025-12-04T14:25:34.4411367Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T14:25:34.4411400Z fn() 2025-12-04T14:25:34.4411548Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4411600Z method(*args, **kwargs) 2025-12-04T14:25:34.4411761Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4411800Z method(*args, **kwargs) 2025-12-04T14:25:34.4411948Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4411985Z with policy(): 2025-12-04T14:25:34.4412135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4412175Z raise RuntimeError(msg) 2025-12-04T14:25:34.4412636Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1262485504 and is now 2826960896. 2025-12-04T14:25:34.4412641Z 2025-12-04T14:25:34.4412715Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4413057Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4413059Z 2025-12-04T14:25:34.4413145Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4413147Z 2025-12-04T14:25:34.4413149Z 2025-12-04T14:25:34.4413223Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4413310Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T14:25:34.4413579Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-551ba22b24efc1d9.xml - 2025-12-04T14:25:34.4413640Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4413999Z FAILED [8.6199s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.4414043Z Traceback (most recent call last): 2025-12-04T14:25:34.4414206Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4414247Z getattr(self, test_name)() 2025-12-04T14:25:34.4414407Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4414440Z fn() 2025-12-04T14:25:34.4414591Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4414630Z method(*args, **kwargs) 2025-12-04T14:25:34.4414779Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4414817Z method(*args, **kwargs) 2025-12-04T14:25:34.4414965Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4415001Z with policy(): 2025-12-04T14:25:34.4415150Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4415189Z raise RuntimeError(msg) 2025-12-04T14:25:34.4415627Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1262485504 and is now 2826960896. 2025-12-04T14:25:34.4415648Z 2025-12-04T14:25:34.4415721Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4416062Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4416064Z 2025-12-04T14:25:34.4416149Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4416211Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
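The RuntimeError above is produced by PyTorch's per-test CUDA memory-leak checker: it snapshots the caching allocator's and the driver's allocated bytes before the test body runs and fails the test if either number has grown afterwards, which is what "allocated memory was 0 and is now reported as 4608" records. A minimal sketch of that before/after bookkeeping, assuming a CUDA (or ROCm-as-CUDA) PyTorch build; run_with_leak_check is an illustrative name, not the harness API:

import torch

def run_with_leak_check(fn, device=0):
    # Snapshot caching-allocator and driver-level usage before the test body.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)
    free_b, total_b = torch.cuda.mem_get_info(device)
    driver_before = total_b - free_b
    fn()
    # Re-measure after the body; any growth is reported as a leak.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_a, total_a = torch.cuda.mem_get_info(device)
    if alloc_after > alloc_before or (total_a - free_a) > driver_before:
        raise RuntimeError(
            f"leak on device {device}: caching allocator was "
            f"{alloc_before} bytes, now {alloc_after}")

The environment variables on the repro line are what arm this path: PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables the check, and setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 suppresses the repro banner, as the log itself notes.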
2025-12-04T14:25:34.4416289Z ======================= 1 failed, 14 deselected in 8.78s ======================= 2025-12-04T14:25:34.4416326Z Got exit code 1 2025-12-04T14:25:34.4416619Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4416745Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T14:25:34.4416968Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-75a23e1b49a2ef65.xml 2025-12-04T14:25:34.4417024Z ============================= test session starts ============================== 2025-12-04T14:25:34.4417135Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.4417174Z cachedir: .pytest_cache 2025-12-04T14:25:34.4417331Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.4417376Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.4417415Z configfile: pytest.ini 2025-12-04T14:25:34.4417577Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.4417933Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4417983Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.4418325Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4418381Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.4418435Z collected 15 items / 11 deselected / 4 selected 2025-12-04T14:25:34.4418487Z stepcurrent: skipping 11 already run items. 2025-12-04T14:25:34.4418530Z Running 4 items in this shard 2025-12-04T14:25:34.4418532Z 2025-12-04T14:25:34.4418942Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 14:23:20.247000 371446 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 371515 2025-12-04T14:25:34.4419094Z I1204 14:23:20.248000 371446 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 371516 2025-12-04T14:25:34.4419245Z I1204 14:23:20.248000 371446 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 371517 2025-12-04T14:25:34.4419392Z I1204 14:23:20.249000 371446 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 371518 2025-12-04T14:25:34.4420081Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. 
Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4420134Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4420856Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4420901Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4421560Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4421601Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4422265Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4422307Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4422800Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4422848Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4423335Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4423383Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4423866Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T14:25:34.4423912Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4424396Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4424467Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4425138Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4425179Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4425891Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4425933Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4426597Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4426637Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4427301Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4427343Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4427828Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T14:25:34.4427886Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4428366Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4428424Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4428903Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4428957Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4429444Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4429510Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T14:25:34.4429744Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4429786Z local_shape = tensor.shape 2025-12-04T14:25:34.4430035Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4430072Z tensor.shape, 2025-12-04T14:25:34.4430339Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4430381Z local_shape = tensor.shape 2025-12-04T14:25:34.4430610Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4430651Z local_shape = tensor.shape 2025-12-04T14:25:34.4430877Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4430913Z tensor.dtype, 2025-12-04T14:25:34.4431143Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4431178Z tensor.shape, 2025-12-04T14:25:34.4431405Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T14:25:34.4431441Z tensor.shape, 2025-12-04T14:25:34.4431669Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4431704Z tensor.dtype, 2025-12-04T14:25:34.4431930Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4431965Z tensor.dtype, 2025-12-04T14:25:34.4432192Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4432233Z local_shape = tensor.shape 2025-12-04T14:25:34.4432459Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4432495Z tensor.shape, 2025-12-04T14:25:34.4432722Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T14:25:34.4432757Z tensor.dtype, 2025-12-04T14:25:34.4432891Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4433044Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4433327Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4433487Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4433778Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4433891Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4434159Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4434323Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4434589Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4434727Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4434992Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4435118Z E1204 14:23:27.410000 371515 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4435386Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4435524Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4436079Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T14:25:34.4436186Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4436376Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4436840Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4436948Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4437147Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4437303Z E1204 14:23:27.410000 371515 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.4437432Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4437582Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4437869Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4438023Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4438296Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4438407Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4438695Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4438833Z E1204 14:23:27.424000 371518 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4439099Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4439235Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4439502Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4439626Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4439895Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4440032Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4440611Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1256194048 and is now 2826960896. 2025-12-04T14:25:34.4440718Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4440903Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4441367Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4441472Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4441673Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4441829Z E1204 14:23:27.424000 371518 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.4441970Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4442138Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4442412Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4442555Z E1204 14:23:27.512000 371517 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4442857Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4442969Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4443235Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4443373Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4443638Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4443775Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4444042Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4444168Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4444437Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4444574Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4445130Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T14:25:34.4445236Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4445424Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4445884Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4445989Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4446189Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4446353Z E1204 14:23:27.512000 371517 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4446494Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4446643Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4446921Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4447082Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4447356Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4447470Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4447734Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4447871Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4448137Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4448274Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4448540Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4448666Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4448932Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4449072Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4449633Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T14:25:34.4449741Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4449927Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4450435Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4450557Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4450767Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4450921Z E1204 14:23:27.512000 371516 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4450959Z FAILED [8.3183s] [ 25%] 2025-12-04T14:25:34.4450962Z 2025-12-04T14:25:34.4451017Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4451204Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T14:25:34.4451274Z Traceback (most recent call last): 2025-12-04T14:25:34.4451436Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4451478Z self._join_processes(fn) 2025-12-04T14:25:34.4451649Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4454408Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4454606Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4454651Z raise RuntimeError(error) 2025-12-04T14:25:34.4454733Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.4454778Z Traceback (most recent call last): 2025-12-04T14:25:34.4454942Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4454984Z getattr(self, test_name)() 2025-12-04T14:25:34.4455143Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T14:25:34.4455181Z fn() 2025-12-04T14:25:34.4455335Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4455375Z method(*args, **kwargs) 2025-12-04T14:25:34.4455541Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4455580Z method(*args, **kwargs) 2025-12-04T14:25:34.4455729Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4455765Z with policy(): 2025-12-04T14:25:34.4455918Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4455957Z raise RuntimeError(msg) 2025-12-04T14:25:34.4456395Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1256194048 and is now 2826960896. 2025-12-04T14:25:34.4456400Z 2025-12-04T14:25:34.4456475Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4456821Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4456824Z 2025-12-04T14:25:34.4456912Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4456917Z 2025-12-04T14:25:34.4456919Z 2025-12-04T14:25:34.4456996Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4457106Z Process 3 terminated with exit code 10, terminating remaining processes. 
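Every failure in this job follows the same multiprocess shape: the parent test process spawns one worker per GPU, the worker that trips the leak check exits with code 10, and the parent terminates the remaining ranks and re-raises the child's exception ("Process 3 terminated with exit code 10, terminating remaining processes."). A minimal sketch of that spawn/join pattern using torch.multiprocessing; the worker body and the hard-coded exit code are illustrative, not the harness internals:

import torch.multiprocessing as mp

def _worker(rank: int, world_size: int) -> None:
    # A real worker would init_process_group and run the test body here.
    if rank == 3:
        raise SystemExit(10)  # mimic the leak-check exit code seen above

if __name__ == "__main__":
    try:
        mp.spawn(_worker, args=(4,), nprocs=4, join=True)
    except mp.ProcessExitedException as exc:
        # spawn() surfaces the first nonzero child exit code to the parent.
        print(f"process {exc.error_index} exited with code {exc.exit_code}")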
2025-12-04T14:25:34.4457394Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-75a23e1b49a2ef65.xml - 2025-12-04T14:25:34.4457457Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4457813Z FAILED [8.3183s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.4457859Z Traceback (most recent call last): 2025-12-04T14:25:34.4458042Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4458085Z getattr(self, test_name)() 2025-12-04T14:25:34.4458245Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4458283Z fn() 2025-12-04T14:25:34.4458481Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4458524Z method(*args, **kwargs) 2025-12-04T14:25:34.4458672Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4458712Z method(*args, **kwargs) 2025-12-04T14:25:34.4458860Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4458896Z with policy(): 2025-12-04T14:25:34.4459048Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4459088Z raise RuntimeError(msg) 2025-12-04T14:25:34.4459528Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1256194048 and is now 2826960896. 2025-12-04T14:25:34.4459533Z 2025-12-04T14:25:34.4459606Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4459951Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4459953Z 2025-12-04T14:25:34.4460040Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4460103Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T14:25:34.4460166Z ======================= 1 failed, 11 deselected in 8.48s ======================= 2025-12-04T14:25:34.4460272Z Got exit code 1 2025-12-04T14:25:34.4460310Z Retrying single test... 
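The retry below reprints the same FutureWarnings, which name the fix they are asking for: FSDP.set_state_dict_type() is superseded by get_state_dict() and set_state_dict() from torch.distributed.checkpoint.state_dict (the API doc linked in the warning). A minimal sketch of that API under the same offload_to_cpu=True setting these tests parametrize; model and optimizer construction are elided and assumed to be FSDP-wrapped under an initialized process group:

from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def state_dict_roundtrip(model, optimizer):
    # cpu_offload here plays the role of the old offload_to_cpu flag.
    opts = StateDictOptions(cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optimizer, options=opts)
    # ... persist and reload model_sd / optim_sd ...
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=opts,
    )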
2025-12-04T14:25:34.4460537Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-9ce025caa7bf0ce9.xml 2025-12-04T14:25:34.4460594Z ============================= test session starts ============================== 2025-12-04T14:25:34.4460707Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.4460746Z cachedir: .pytest_cache 2025-12-04T14:25:34.4460903Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.4460948Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.4460988Z configfile: pytest.ini 2025-12-04T14:25:34.4461150Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.4461542Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4461591Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.4461938Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4462009Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.4462064Z collected 15 items / 14 deselected / 1 selected 2025-12-04T14:25:34.4462400Z stepcurrent: skipping 11 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4462444Z Running 1 items in this shard 2025-12-04T14:25:34.4462446Z 2025-12-04T14:25:34.4462881Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 14:23:31.217000 371848 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 371917 2025-12-04T14:25:34.4463036Z I1204 14:23:31.217000 371848 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 371918 2025-12-04T14:25:34.4463190Z I1204 14:23:31.218000 371848 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 371919 2025-12-04T14:25:34.4463340Z I1204 14:23:31.218000 371848 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 371920 2025-12-04T14:25:34.4464021Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T14:25:34.4464067Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4464739Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4464784Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4465449Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4465491Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4466161Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4466228Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4466724Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4466781Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4467266Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4467325Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4467812Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4467857Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4468340Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
2025-12-04T14:25:34.4466724Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4466781Z device = _get_pg_default_device(group)
[the same _optim_utils.py:1190 UserWarning repeats three more times, once per rank]
2025-12-04T14:25:34.4469058Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4469100Z FSDP.set_state_dict_type(
[the same test_fsdp_dtensor_state_dict.py:124 FutureWarning repeats three more times, once per rank]
2025-12-04T14:25:34.4471775Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4471837Z distributed_c10d._get_pg_default_device(pg).type
[the same _shard_utils.py:59 UserWarning repeats three more times, once per rank]
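[The two UserWarnings above name `_get_object_coll_device` as the replacement for the private helper. A one-function sketch, assuming the replacement keeps the optional process-group argument the warning implies; `object_coll_device` is an illustrative wrapper, not part of the library:

    from torch.distributed import distributed_c10d

    def object_coll_device(pg=None):
        # Replacement named by the warning for the deprecated
        # distributed_c10d._get_pg_default_device(pg)
        return distributed_c10d._get_object_coll_device(pg)

Both names are private (underscore-prefixed), so out-of-tree code should treat this as a best-effort shim rather than a stable API.]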
2025-12-04T14:25:34.4473695Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T14:25:34.4473738Z local_shape = tensor.shape
2025-12-04T14:25:34.4473970Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T14:25:34.4474008Z tensor.shape,
2025-12-04T14:25:34.4474237Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T14:25:34.4474274Z tensor.dtype,
[the same three _state_dict_utils.py FutureWarnings repeat for each of the remaining ranks]
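[The FutureWarnings above ask callers to move from ShardedTensor to DTensor. A minimal sketch of building the DTensor equivalent of a dim-0 sharded tensor, assuming an already-initialized process group like this test's 4-rank setup; `shard_dim0` is an illustrative helper, not code from the test:

    import torch
    import torch.distributed as dist
    from torch.distributed.device_mesh import init_device_mesh
    from torch.distributed.tensor import Shard, distribute_tensor

    def shard_dim0(t: torch.Tensor):
        # Build a 1-D mesh over the current world and scatter dim 0 across it,
        # producing a DTensor where a ShardedTensor would have been used.
        mesh = init_device_mesh("cuda", (dist.get_world_size(),))
        return distribute_tensor(t, mesh, [Shard(0)])
]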
2025-12-04T14:25:34.4476861Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4477017Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.4477297Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4477444Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]     getattr(self, test_name)()
2025-12-04T14:25:34.4477723Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4477837Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]     fn()
2025-12-04T14:25:34.4478107Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4478247Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.4478514Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4478654Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]     method(*args, **kwargs)
2025-12-04T14:25:34.4478921Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4479052Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]     with policy():
2025-12-04T14:25:34.4479320Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4479469Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]     raise RuntimeError(msg)
2025-12-04T14:25:34.4480038Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 803209216 and is now 2826960896.
2025-12-04T14:25:34.4480147Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4480378Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4480858Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.4480967Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4481168Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4481325Z E1204 14:23:38.623000 371920 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.4481452Z E1204 14:23:38.630000 371919 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank 2 (pid 371919) hits the identical traceback; frames as above]
2025-12-04T14:25:34.4484614Z E1204 14:23:38.630000 371919 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896.
[the same repro instructions are printed for rank 2]
2025-12-04T14:25:34.4485845Z E1204 14:23:38.630000 371919 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.4485972Z E1204 14:23:38.631000 371917 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank 0 (pid 371917) hits the identical traceback; frames as above]
2025-12-04T14:25:34.4489120Z E1204 14:23:38.631000 371917 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
[the same repro instructions are printed for rank 0]
2025-12-04T14:25:34.4490398Z E1204 14:23:38.631000 371917 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.4490527Z E1204 14:23:38.637000 371918 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank 1 (pid 371918) hits the identical traceback; frames as above]
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4493146Z E1204 14:23:38.637000 371918 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4493711Z E1204 14:23:38.637000 371918 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T14:25:34.4493819Z E1204 14:23:38.637000 371918 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4494005Z E1204 14:23:38.637000 371918 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4494470Z E1204 14:23:38.637000 371918 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4494577Z E1204 14:23:38.637000 371918 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4494779Z E1204 14:23:38.637000 371918 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4494933Z E1204 14:23:38.637000 371918 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4494972Z FAILED [8.6204s] [100%] 2025-12-04T14:25:34.4494975Z 2025-12-04T14:25:34.4495030Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4495219Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T14:25:34.4495265Z Traceback (most recent call last): 2025-12-04T14:25:34.4495429Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4495472Z self._join_processes(fn) 2025-12-04T14:25:34.4495645Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4495697Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4495873Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4495916Z raise RuntimeError(error) 2025-12-04T14:25:34.4495995Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.4496038Z Traceback (most recent call last): 2025-12-04T14:25:34.4496199Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4496240Z getattr(self, test_name)() 2025-12-04T14:25:34.4496408Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T14:25:34.4497672Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 803209216 and is now 2826960896.
2025-12-04T14:25:34.4497675Z
2025-12-04T14:25:34.4497750Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4498091Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.4498093Z
2025-12-04T14:25:34.4498181Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4498183Z
2025-12-04T14:25:34.4498185Z
2025-12-04T14:25:34.4498261Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.4498347Z Process 3 terminated with exit code 10, terminating remaining processes.
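[The failure is raised by the CUDA memory-leak checker enabled via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1, which snapshots caching-allocator and driver-level memory around the test body and compares them, per the numbers in the RuntimeError. A simplified sketch of that bookkeeping; `assert_no_cuda_leak` is illustrative, not the actual CudaMemoryLeakCheck implementation in common_utils.py:

    import torch

    def assert_no_cuda_leak(fn, device=0):
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)  # caching allocator
        free, total = torch.cuda.mem_get_info(device)
        driver_before = total - free                        # driver-level
        fn()
        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free
        # Flag a leak only when the driver confirms the allocator's growth,
        # matching the "CUDA driver API confirmed a leak" wording above.
        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"possible CUDA leak on device {device}: allocator "
                f"{alloc_before} -> {alloc_after} bytes, driver "
                f"{driver_before} -> {driver_after} bytes"
            )
]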
2025-12-04T14:25:34.4498619Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-9ce025caa7bf0ce9.xml -
2025-12-04T14:25:34.4498681Z =========================== short test summary info ============================
2025-12-04T14:25:34.4499033Z FAILED [8.6204s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception:
[the same worker traceback, leak RuntimeError, and repro instructions repeat here verbatim]
2025-12-04T14:25:34.4501325Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.4501391Z ======================= 1 failed, 14 deselected in 8.78s =======================
2025-12-04T14:25:34.4501427Z Got exit code 1
2025-12-04T14:25:34.4501466Z Retrying single test...
2025-12-04T14:25:34.4501707Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-7293483308a4fe17.xml
[the same pytest session header, collection warnings, stepcurrent banner, and "Running 1 items in this shard" line as in the first attempt are printed again]
2025-12-04T14:25:34.4504000Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 14:23:42.599000 372250 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 372319
2025-12-04T14:25:34.4504157Z I1204 14:23:42.600000 372250 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 372320
2025-12-04T14:25:34.4504310Z I1204 14:23:42.601000 372250 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 372321
2025-12-04T14:25:34.4504472Z I1204 14:23:42.601000 372250 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 372322
[the same set of FutureWarning/UserWarning deprecation messages as in the first attempt is emitted again by all four ranks]
2025-12-04T14:25:34.4518058Z E1204 14:23:49.935000 372319 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank 0 (pid 372319) hits the same leak traceback as in the first attempt; frames as above]
2025-12-04T14:25:34.4521317Z E1204 14:23:49.935000 372319 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536.
[the same repro instructions are printed for rank 0]
2025-12-04T14:25:34.4522554Z E1204 14:23:49.935000 372319 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.4522686Z E1204 14:23:49.949000 372320 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank 1 (pid 372320) hits the identical traceback; frames as above]
2025-12-04T14:25:34.4525890Z E1204 14:23:49.949000 372320 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896.
[the same repro instructions are printed for rank 1]
2025-12-04T14:25:34.4527133Z E1204 14:23:49.949000 372320 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.4527271Z E1204 14:23:49.976000 372322 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank 3 (pid 372322) hits the identical traceback; frames as above]
2025-12-04T14:25:34.4530474Z E1204 14:23:49.976000 372322 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1254096896 and is now 2826960896.
2025-12-04T14:25:34.4530584Z E1204 14:23:49.976000 372322 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4530770Z E1204 14:23:49.976000 372322 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4531235Z E1204 14:23:49.976000 372322 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4531342Z E1204 14:23:49.976000 372322 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4531560Z E1204 14:23:49.976000 372322 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4531729Z E1204 14:23:49.976000 372322 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.4531856Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4532005Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4532294Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4532439Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4532724Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4532840Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4533106Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4533247Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4533512Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4533650Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4533916Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4534043Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4534312Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4534450Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4535004Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T14:25:34.4535111Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4535296Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4535759Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4535887Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4536087Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4536242Z E1204 14:23:49.998000 372321 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4536282Z FAILED [8.7202s] [100%] 2025-12-04T14:25:34.4536284Z 2025-12-04T14:25:34.4536337Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4536538Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T14:25:34.4536585Z Traceback (most recent call last): 2025-12-04T14:25:34.4536747Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4536790Z self._join_processes(fn) 2025-12-04T14:25:34.4536971Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4537025Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4537200Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4537244Z raise RuntimeError(error) 2025-12-04T14:25:34.4537322Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.4537368Z Traceback (most recent call last): 2025-12-04T14:25:34.4537527Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4537569Z getattr(self, test_name)() 2025-12-04T14:25:34.4537727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T14:25:34.4537763Z fn() 2025-12-04T14:25:34.4537911Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4537951Z method(*args, **kwargs) 2025-12-04T14:25:34.4538099Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4538138Z method(*args, **kwargs) 2025-12-04T14:25:34.4538286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4538322Z with policy(): 2025-12-04T14:25:34.4538471Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4538513Z raise RuntimeError(msg) 2025-12-04T14:25:34.4538954Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T14:25:34.4538956Z 2025-12-04T14:25:34.4539031Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4539373Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4539377Z 2025-12-04T14:25:34.4539463Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4539476Z 2025-12-04T14:25:34.4539534Z Process 2 exited with error code 10 and exception: 2025-12-04T14:25:34.4539593Z Traceback (most recent call last): 2025-12-04T14:25:34.4539754Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4539795Z getattr(self, test_name)() 2025-12-04T14:25:34.4539951Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4539984Z fn() 2025-12-04T14:25:34.4540132Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4540198Z method(*args, **kwargs) 2025-12-04T14:25:34.4540359Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4540398Z method(*args, **kwargs) 2025-12-04T14:25:34.4540546Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4540582Z with policy(): 2025-12-04T14:25:34.4540744Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4540785Z raise RuntimeError(msg) 2025-12-04T14:25:34.4541224Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T14:25:34.4541227Z 2025-12-04T14:25:34.4541300Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4541640Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4541644Z 2025-12-04T14:25:34.4541730Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4541733Z 2025-12-04T14:25:34.4541790Z Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.4541836Z Traceback (most recent call last): 2025-12-04T14:25:34.4541995Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4542036Z getattr(self, test_name)() 2025-12-04T14:25:34.4542193Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4542227Z fn() 2025-12-04T14:25:34.4542375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4542415Z method(*args, **kwargs) 2025-12-04T14:25:34.4542562Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4542602Z method(*args, **kwargs) 2025-12-04T14:25:34.4542750Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4542786Z with policy(): 2025-12-04T14:25:34.4542936Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4542976Z raise RuntimeError(msg) 2025-12-04T14:25:34.4543414Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1254096896 and is now 2826960896. 2025-12-04T14:25:34.4543432Z 2025-12-04T14:25:34.4543517Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4543858Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4543861Z 2025-12-04T14:25:34.4543946Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4543948Z 2025-12-04T14:25:34.4543950Z 2025-12-04T14:25:34.4544026Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4544123Z Process 0 terminated with exit code 10, terminating remaining processes. 
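----------------------------- editor's note: the leak-check mechanism -----------------------------
The failures above all come from PyTorch's opt-in CUDA memory-leak check: the test wrapper enters a policy context manager (the "with policy():" frames in the tracebacks), snapshots each device's caching-allocator and driver-level memory counters before the test body runs, and compares them again in __exit__. A test process exits with code 10 when the caching allocator reports growth and the driver-level numbers confirm it, which matches the wording of the RuntimeError. The sketch below is a deliberately simplified, hypothetical stand-in for that idea; the real CudaMemoryLeakCheck in torch/testing/_internal/common_utils.py adds retries, garbage collection, and per-device bookkeeping that are omitted here.

    import torch

    class LeakCheck:
        # Hypothetical simplification of the leak-check policy; not the
        # actual implementation from common_utils.py.
        def __enter__(self):
            torch.cuda.empty_cache()
            self.before = []
            for dev in range(torch.cuda.device_count()):
                free, total = torch.cuda.mem_get_info(dev)
                # (caching-allocator bytes, driver-allocated bytes)
                self.before.append((torch.cuda.memory_allocated(dev), total - free))
            return self

        def __exit__(self, exc_type, exc, tb):
            if exc_type is not None:
                return False  # never mask the test's own exception
            torch.cuda.empty_cache()
            for dev, (alloc0, driver0) in enumerate(self.before):
                alloc1 = torch.cuda.memory_allocated(dev)
                free, total = torch.cuda.mem_get_info(dev)
                driver1 = total - free
                # Flag a leak only when the driver confirms what the caching
                # allocator reports ("CUDA driver API confirmed a leak").
                if alloc1 > alloc0 and driver1 > driver0:
                    raise RuntimeError(
                        f"leak on device {dev}: allocator {alloc0} -> {alloc1}, "
                        f"driver {driver0} -> {driver1}"
                    )
            return False
----------------------------------------------------------------------------------------------------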
2025-12-04T14:25:34.4544393Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-7293483308a4fe17.xml - 2025-12-04T14:25:34.4544454Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4544816Z FAILED [8.7202s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.4544862Z Traceback (most recent call last): 2025-12-04T14:25:34.4545025Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4545068Z getattr(self, test_name)() 2025-12-04T14:25:34.4545226Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4545259Z fn() 2025-12-04T14:25:34.4545408Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4545448Z method(*args, **kwargs) 2025-12-04T14:25:34.4545598Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4545637Z method(*args, **kwargs) 2025-12-04T14:25:34.4545786Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4545821Z with policy(): 2025-12-04T14:25:34.4545970Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4546011Z raise RuntimeError(msg) 2025-12-04T14:25:34.4546451Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
2025-12-04T14:25:34.4546454Z 2025-12-04T14:25:34.4546528Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4546869Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4546872Z 2025-12-04T14:25:34.4546958Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4546960Z 2025-12-04T14:25:34.4547017Z Process 2 exited with error code 10 and exception: 2025-12-04T14:25:34.4547061Z Traceback (most recent call last): 2025-12-04T14:25:34.4547221Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4547276Z getattr(self, test_name)() 2025-12-04T14:25:34.4547432Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4547478Z fn() 2025-12-04T14:25:34.4547628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4547670Z method(*args, **kwargs) 2025-12-04T14:25:34.4547817Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4547857Z method(*args, **kwargs) 2025-12-04T14:25:34.4548004Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4548050Z with policy(): 2025-12-04T14:25:34.4548200Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4548241Z raise RuntimeError(msg) 2025-12-04T14:25:34.4548690Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T14:25:34.4548693Z 2025-12-04T14:25:34.4548766Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4549107Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4549109Z 2025-12-04T14:25:34.4549193Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4549196Z 2025-12-04T14:25:34.4549255Z Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.4549299Z Traceback (most recent call last): 2025-12-04T14:25:34.4549460Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4549501Z getattr(self, test_name)() 2025-12-04T14:25:34.4549657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4549691Z fn() 2025-12-04T14:25:34.4549839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4549877Z method(*args, **kwargs) 2025-12-04T14:25:34.4550026Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4550063Z method(*args, **kwargs) 2025-12-04T14:25:34.4550243Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4550279Z with policy(): 2025-12-04T14:25:34.4550429Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4550471Z raise RuntimeError(msg) 2025-12-04T14:25:34.4550906Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1254096896 and is now 2826960896. 2025-12-04T14:25:34.4550908Z 2025-12-04T14:25:34.4550982Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4551322Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4551352Z 2025-12-04T14:25:34.4551437Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4551502Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
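----------------------------- editor's note: reproducing the failure -----------------------------
The repro command printed above is meant to be run from the repository root. PYTORCH_TEST_WITH_ROCM=1 selects the ROCm code paths, PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables the leak checker that produced these failures, and PYTORCH_PRINT_REPRO_ON_FAILURE=0 would silence the repro banner itself. A hedged sketch of how flags like these are typically consumed on the Python side (the variable names below come from the log; the actual parsing in common_utils.py may differ):

    import os

    # "1" enables the behavior; unset or "0" leaves it off.
    TEST_WITH_ROCM = os.environ.get("PYTORCH_TEST_WITH_ROCM", "0") == "1"
    CUDA_MEM_LEAK_CHECK = os.environ.get("PYTORCH_TEST_CUDA_MEM_LEAK_CHECK", "0") == "1"
    PRINT_REPRO_ON_FAILURE = os.environ.get("PYTORCH_PRINT_REPRO_ON_FAILURE", "1") == "1"

    if CUDA_MEM_LEAK_CHECK:
        print("each test will run under the memory leak-check policy")
----------------------------------------------------------------------------------------------------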
2025-12-04T14:25:34.4551564Z ======================= 1 failed, 14 deselected in 8.86s =======================
2025-12-04T14:25:34.4551599Z Got exit code 1
2025-12-04T14:25:34.4551890Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda
2025-12-04T14:25:34.4552032Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T14:25:34.4552259Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-e489aad8957daadf.xml
2025-12-04T14:25:34.4552319Z ============================= test session starts ==============================
2025-12-04T14:25:34.4552444Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.4552486Z cachedir: .pytest_cache
2025-12-04T14:25:34.4552642Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.4552687Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.4552726Z configfile: pytest.ini
2025-12-04T14:25:34.4552889Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.4553253Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4553305Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.4553651Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4553709Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.4553764Z collected 15 items / 12 deselected / 3 selected
2025-12-04T14:25:34.4553815Z stepcurrent: skipping 12 already run items.
2025-12-04T14:25:34.4556524Z Running 3 items in this shard
2025-12-04T14:25:34.4556528Z
2025-12-04T14:25:34.4556913Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda I1204 14:23:54.047000 372652 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 372721
2025-12-04T14:25:34.4557073Z I1204 14:23:54.048000 372652 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 372722
2025-12-04T14:25:34.4557228Z I1204 14:23:54.049000 372652 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 372723
2025-12-04T14:25:34.4557377Z I1204 14:23:54.049000 372652 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 372724
2025-12-04T14:25:34.4558056Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4558139Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.4558827Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4558870Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.4559544Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4559587Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.4560300Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4560342Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.4560840Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4560893Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.4561381Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4561427Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.4561917Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4561963Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4562446Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4562492Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4562754Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4562908Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4563209Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4563378Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4563655Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4563770Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4564051Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4564193Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4564480Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4564618Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4564885Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4565013Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4565288Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4565429Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4565940Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4566049Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4566238Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4566657Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4566764Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4566967Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4567125Z E1204 14:24:01.322000 372722 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4567254Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4567417Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4567710Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4567854Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4568136Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4568251Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4568518Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4568666Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4568932Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4569070Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4569335Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4569461Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T14:25:34.4569730Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4569868Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4570413Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1251999744 and is now 2587885568. 2025-12-04T14:25:34.4570520Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4570709Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4571121Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4571226Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4571428Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4571583Z E1204 14:24:01.333000 372724 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.4571726Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4571889Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4572166Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4572309Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4572602Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4572717Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4573001Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4573139Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4573404Z E1204 14:24:01.342000 372721 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4573543Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4573807Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4573934Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4574201Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4574339Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4574848Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 2025-12-04T14:25:34.4574955Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4575144Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4575559Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4575667Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4575869Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4576039Z E1204 14:24:01.342000 372721 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.4576179Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4576328Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4576603Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4576756Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4577031Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4577144Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4577420Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4577559Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4577826Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4577969Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4578233Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4578360Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4578626Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4578765Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4579268Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T14:25:34.4579377Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4579563Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4579974Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4580080Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4580328Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4580500Z E1204 14:24:01.359000 372723 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4580539Z FAILED [8.5214s] [ 33%] 2025-12-04T14:25:34.4580541Z 2025-12-04T14:25:34.4580599Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4580742Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda _ 2025-12-04T14:25:34.4580787Z Traceback (most recent call last): 2025-12-04T14:25:34.4580962Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4581010Z self._join_processes(fn) 2025-12-04T14:25:34.4581182Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4581237Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4581425Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4581470Z raise RuntimeError(error) 2025-12-04T14:25:34.4581550Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T14:25:34.4581594Z Traceback (most recent call last): 2025-12-04T14:25:34.4581754Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4581796Z getattr(self, test_name)() 2025-12-04T14:25:34.4581952Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4581987Z fn() 2025-12-04T14:25:34.4582135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4582177Z method(*args, **kwargs) 2025-12-04T14:25:34.4582326Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4582366Z method(*args, **kwargs) 2025-12-04T14:25:34.4582517Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4582553Z with policy(): 2025-12-04T14:25:34.4582704Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4582744Z 
raise RuntimeError(msg) 2025-12-04T14:25:34.4583136Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4583140Z 2025-12-04T14:25:34.4583214Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4583507Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4583509Z 2025-12-04T14:25:34.4583594Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4583596Z 2025-12-04T14:25:34.4583598Z 2025-12-04T14:25:34.4583675Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4583762Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.4584034Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-e489aad8957daadf.xml - 2025-12-04T14:25:34.4584122Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4584431Z FAILED [8.5214s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T14:25:34.4584477Z Traceback (most recent call last): 2025-12-04T14:25:34.4584639Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4584681Z getattr(self, test_name)() 2025-12-04T14:25:34.4584848Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4584883Z fn() 2025-12-04T14:25:34.4585033Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4585074Z method(*args, **kwargs) 2025-12-04T14:25:34.4585232Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4585272Z method(*args, **kwargs) 2025-12-04T14:25:34.4585419Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4585456Z with policy(): 2025-12-04T14:25:34.4585606Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4585646Z raise RuntimeError(msg) 2025-12-04T14:25:34.4586035Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T14:25:34.4586039Z
2025-12-04T14:25:34.4586112Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4586405Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.4586408Z
2025-12-04T14:25:34.4586492Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4586555Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.4586618Z ======================= 1 failed, 12 deselected in 8.66s =======================
2025-12-04T14:25:34.4586654Z Got exit code 1
2025-12-04T14:25:34.4586693Z Retrying single test...
2025-12-04T14:25:34.4586915Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-c24760d8952bbc7b.xml
2025-12-04T14:25:34.4586973Z ============================= test session starts ==============================
2025-12-04T14:25:34.4587087Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.4587126Z cachedir: .pytest_cache
2025-12-04T14:25:34.4587282Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.4587326Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.4587366Z configfile: pytest.ini
2025-12-04T14:25:34.4587530Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.4587889Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4587959Z class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.4588300Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4588356Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.4588411Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.4588714Z stepcurrent: skipping 12 already run items.
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda
2025-12-04T14:25:34.4588759Z Running 1 items in this shard
2025-12-04T14:25:34.4588761Z
2025-12-04T14:25:34.4589132Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda I1204 14:24:05.286000 373054 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 373123
2025-12-04T14:25:34.4589296Z I1204 14:24:05.287000 373054 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 373124
2025-12-04T14:25:34.4589448Z I1204 14:24:05.288000 373054 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 373125
2025-12-04T14:25:34.4589596Z I1204 14:24:05.289000 373054 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 373126
2025-12-04T14:25:34.4590320Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4590366Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.4591037Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4591082Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.4591748Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4591791Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.4592457Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4592512Z FSDP.set_state_dict_type(
2025-12-04T14:25:34.4593011Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4593075Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.4593577Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4593623Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.4594126Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4594173Z device = _get_pg_default_device(group)
2025-12-04T14:25:34.4594657Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatibility reasons. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T14:25:34.4594702Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4594835Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4594989Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4595269Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4595420Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4595696Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4595811Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4596079Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4596221Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4596491Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4596632Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4596900Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4597036Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4597320Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4597458Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4597978Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1254096896 and is now 2587885568. 
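[editor's note] The FutureWarning emitted at test_fsdp_dtensor_state_dict.py:80 above points at torch.distributed.checkpoint.state_dict as the replacement API. A minimal sketch of the suggested migration, assuming an FSDP-wrapped model and a single optimizer (fsdp_model and optimizer are placeholders; only get_state_dict/set_state_dict come from the warning):

    # Sketch only: replaces the deprecated FSDP.set_state_dict_type() context
    # with the get_state_dict()/set_state_dict() APIs named in the warning.
    from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

    # Collect model and optimizer state dicts; per the warning these APIs
    # work across FSDP1, FSDP2, and DDP.
    model_state, optim_state = get_state_dict(fsdp_model, optimizer)

    # ... save / load via torch.distributed.checkpoint ...

    # Restore both in one call; model_state_dict/optim_state_dict are
    # keyword-only arguments.
    set_state_dict(
        fsdp_model,
        optimizer,
        model_state_dict=model_state,
        optim_state_dict=optim_state,
    )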
2025-12-04T14:25:34.4598088Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4598276Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4598777Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4598884Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4599090Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4599246Z E1204 14:24:12.619000 373126 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.4599381Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4599531Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4599808Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4599953Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4600263Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4600379Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4600645Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4600784Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4601049Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4601188Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4601454Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4601612Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4601880Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4602019Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4602538Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4602645Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4602844Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4603255Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4603361Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4603561Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4603720Z E1204 14:24:12.621000 373125 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4603850Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4603999Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4604274Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4604417Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4604692Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4604807Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4605074Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4605213Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4605478Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T14:25:34.4605629Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4605910Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4606036Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4606303Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4606453Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4606966Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4607074Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4607261Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4607674Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4607781Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4607982Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4608137Z E1204 14:24:12.633000 373124 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4608265Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4608415Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4608692Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4608836Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4609112Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4609225Z E1204 14:24:12.635000 373123 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4609494Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4609637Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4609904Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4610061Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4610355Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4610483Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4610769Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4610909Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4611426Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 
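[editor's note] All four ranks fail the same way: the caching allocator goes from 0 to 2560 bytes across the test body, which is exactly what PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 asserts on. A stripped-down sketch of that before/after comparison (not the actual CudaMemoryLeakCheck implementation; check_leak and its logic are illustrative):

    import torch

    def check_leak(test_body, device=0):
        # Settle outstanding work and return cached blocks before sampling,
        # so the comparison reflects real allocations, not allocator noise.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        allocated_before = torch.cuda.memory_allocated(device)
        free_before, _total = torch.cuda.mem_get_info(device)

        test_body()

        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        allocated_after = torch.cuda.memory_allocated(device)
        free_after, _total = torch.cuda.mem_get_info(device)

        # Driver-level usage (total minus free) corroborates the allocator
        # number, mirroring the two figures printed in the RuntimeError above.
        if allocated_after > allocated_before and free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: "
                f"{allocated_before} -> {allocated_after} bytes"
            )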
2025-12-04T14:25:34.4611533Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4611719Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4612130Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4612237Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4612437Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4612592Z E1204 14:24:12.635000 373123 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.4612629Z FAILED [8.6203s] [100%] 2025-12-04T14:25:34.4612631Z 2025-12-04T14:25:34.4612686Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4612827Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda _ 2025-12-04T14:25:34.4612873Z Traceback (most recent call last): 2025-12-04T14:25:34.4613034Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4613079Z self._join_processes(fn) 2025-12-04T14:25:34.4613250Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4613304Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4613482Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4613526Z raise RuntimeError(error) 2025-12-04T14:25:34.4613604Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T14:25:34.4613648Z Traceback (most recent call last): 2025-12-04T14:25:34.4613807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4613864Z getattr(self, test_name)() 2025-12-04T14:25:34.4614019Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4614069Z fn() 2025-12-04T14:25:34.4614218Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4614259Z method(*args, **kwargs) 2025-12-04T14:25:34.4614407Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4614447Z method(*args, **kwargs) 2025-12-04T14:25:34.4614594Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4614632Z with policy(): 2025-12-04T14:25:34.4614792Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4614834Z 
raise RuntimeError(msg) 2025-12-04T14:25:34.4615231Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4615236Z 2025-12-04T14:25:34.4615313Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4615605Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4615607Z 2025-12-04T14:25:34.4615693Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4615695Z 2025-12-04T14:25:34.4615697Z 2025-12-04T14:25:34.4615773Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4615861Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.4616131Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-c24760d8952bbc7b.xml - 2025-12-04T14:25:34.4616190Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4616499Z FAILED [8.6203s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T14:25:34.4616544Z Traceback (most recent call last): 2025-12-04T14:25:34.4616707Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4616749Z getattr(self, test_name)() 2025-12-04T14:25:34.4616906Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4616941Z fn() 2025-12-04T14:25:34.4617091Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4617130Z method(*args, **kwargs) 2025-12-04T14:25:34.4617279Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4617318Z method(*args, **kwargs) 2025-12-04T14:25:34.4617464Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4617502Z with policy(): 2025-12-04T14:25:34.4617654Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4617693Z raise RuntimeError(msg) 2025-12-04T14:25:34.4618094Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T14:25:34.4618108Z 2025-12-04T14:25:34.4618181Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4618471Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4618473Z 2025-12-04T14:25:34.4618569Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4618633Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T14:25:34.4618693Z ======================= 1 failed, 14 deselected in 8.78s ======================= 2025-12-04T14:25:34.4618731Z Got exit code 1 2025-12-04T14:25:34.4618770Z Retrying single test... 2025-12-04T14:25:34.4619003Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-37f933052d2d9aec.xml 2025-12-04T14:25:34.4619060Z ============================= test session starts ============================== 2025-12-04T14:25:34.4619171Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.4619213Z cachedir: .pytest_cache 2025-12-04T14:25:34.4619368Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.4619414Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.4619453Z configfile: pytest.ini 2025-12-04T14:25:34.4619615Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.4619975Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4620026Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.4620412Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4620470Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.4620525Z collected 15 items / 14 deselected / 1 selected 2025-12-04T14:25:34.4620816Z stepcurrent: skipping 12 already run items. 
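[editor's note] The "stepcurrent: skipping 12 already run items" lines come from PyTorch's resume-from-last-test pytest machinery: it records how far the shard got, so a retry can deselect everything already executed and re-run only the failing test. A hypothetical sketch of such a collection hook (the cache path and key are assumptions, not the real plugin):

    # Hypothetical conftest.py sketch: skip items that an earlier, aborted
    # run already executed, so a retry resumes at the failing test.
    import json
    from pathlib import Path

    CACHE = Path(".pytest_cache/stepcurrent.json")  # assumed location

    def pytest_collection_modifyitems(config, items):
        if not CACHE.exists():
            return
        last_run = json.loads(CACHE.read_text())["last_run"]  # assumed key
        node_ids = [item.nodeid for item in items]
        if last_run in node_ids:
            already_run = node_ids.index(last_run)
            print(f"stepcurrent: skipping {already_run} already run items.")
            del items[:already_run]  # keep the failing test and everything after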
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4620859Z Running 1 items in this shard 2025-12-04T14:25:34.4620861Z 2025-12-04T14:25:34.4621237Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda I1204 14:24:16.406000 373456 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 373525 2025-12-04T14:25:34.4621390Z I1204 14:24:16.407000 373456 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 373526 2025-12-04T14:25:34.4621541Z I1204 14:24:16.407000 373456 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 373527 2025-12-04T14:25:34.4621691Z I1204 14:24:16.408000 373456 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 373528 2025-12-04T14:25:34.4622363Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4622443Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4623114Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4623158Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4623840Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4623881Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4624546Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T14:25:34.4624590Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4625091Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4625140Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4625626Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4625674Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4626157Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4626202Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4626687Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T14:25:34.4626752Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4626887Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4627041Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4627324Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4627483Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4627761Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4627876Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4628153Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4628292Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4628558Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4628699Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4628963Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4629093Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4629359Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4629497Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4630008Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 
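[editor's note] The repeated UserWarning from _optim_utils.py:1190 names its own replacements: _get_object_coll_device for picking an object-collective device and _device_capability(group) for querying a group's supported device types. Both are private helpers, so the import path and signature below are assumptions taken from the warning text, not a stable API:

    # Assumed replacement for the deprecated call shown in the warning; these
    # are private torch.distributed internals and may move or change.
    from torch.distributed.distributed_c10d import _get_object_coll_device

    device = _get_object_coll_device(group)  # was: _get_pg_default_device(group)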
2025-12-04T14:25:34.4630118Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4630341Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4630758Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4630865Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4631079Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4631248Z E1204 14:24:23.702000 373525 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.4631376Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4631526Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4631812Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4631957Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4632232Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4632359Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4632625Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4632762Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4633028Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4633166Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4633434Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4633561Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4633830Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4633970Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4634479Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4634586Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4634773Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4635186Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4635300Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4635514Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4635670Z E1204 14:24:23.721000 373527 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4635798Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4635948Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4636232Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4636377Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4636666Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4636780Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4637045Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4637184Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4637449Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T14:25:34.4637588Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4637853Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4637979Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4638246Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4638384Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4638894Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4638999Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4639185Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4639595Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4639792Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4639995Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4640149Z E1204 14:24:23.769000 373526 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4640312Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4640476Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4640752Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4640911Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4641190Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4641303Z E1204 14:24:23.785000 373528 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4641569Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4641707Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4641974Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4642112Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4642376Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4642503Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4642770Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4642910Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4643417Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 958398464 and is now 2587885568. 
2025-12-04T14:25:34.4643522Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4643716Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4644142Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4644261Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4644460Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4644614Z E1204 14:24:23.785000 373528 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.4644661Z FAILED [8.5196s] [100%] 2025-12-04T14:25:34.4644664Z 2025-12-04T14:25:34.4644718Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4644864Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda _ 2025-12-04T14:25:34.4644911Z Traceback (most recent call last): 2025-12-04T14:25:34.4645081Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4645125Z self._join_processes(fn) 2025-12-04T14:25:34.4645297Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4645350Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4645527Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4645569Z raise RuntimeError(error) 2025-12-04T14:25:34.4645648Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.4645691Z Traceback (most recent call last): 2025-12-04T14:25:34.4645853Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4645894Z getattr(self, test_name)() 2025-12-04T14:25:34.4646053Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4646088Z fn() 2025-12-04T14:25:34.4646239Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4646278Z method(*args, **kwargs) 2025-12-04T14:25:34.4646428Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4646467Z method(*args, **kwargs) 2025-12-04T14:25:34.4646617Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4646655Z with policy(): 2025-12-04T14:25:34.4646806Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4646846Z 
raise RuntimeError(msg) 2025-12-04T14:25:34.4647239Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 2025-12-04T14:25:34.4647242Z 2025-12-04T14:25:34.4647316Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4647608Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4647611Z 2025-12-04T14:25:34.4647707Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4647710Z 2025-12-04T14:25:34.4647721Z 2025-12-04T14:25:34.4647794Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4647881Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.4648150Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-37f933052d2d9aec.xml - 2025-12-04T14:25:34.4648211Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4648527Z FAILED [8.5196s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.4648573Z Traceback (most recent call last): 2025-12-04T14:25:34.4648736Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4648780Z getattr(self, test_name)() 2025-12-04T14:25:34.4648946Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4648982Z fn() 2025-12-04T14:25:34.4649131Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4649171Z method(*args, **kwargs) 2025-12-04T14:25:34.4649319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4649358Z method(*args, **kwargs) 2025-12-04T14:25:34.4649507Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4649542Z with policy(): 2025-12-04T14:25:34.4649694Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4649735Z raise RuntimeError(msg) 2025-12-04T14:25:34.4650123Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 
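[editor's note] The parent traceback (wrapper -> _join_processes -> _check_return_codes) shows how the failure surfaces: each rank runs the test body in its own process, exits with code 10 when the leak check raises, and the parent converts any nonzero exit code back into the RuntimeError seen in the FAILURES section. A stripped-down analogue of that orchestration (run_distributed_test and the exit-code constant are illustrative, not common_distributed.py itself):

    import multiprocessing as mp
    import sys

    TEST_ERROR_EXIT_CODE = 10  # mirrors the "exit code: 10" lines above

    def _worker(rank, test_fn):
        try:
            test_fn(rank)
        except Exception:
            # The child logs its own traceback, then signals failure upward
            # through its process exit code.
            sys.exit(TEST_ERROR_EXIT_CODE)

    def run_distributed_test(test_fn, world_size=4):
        procs = [
            mp.Process(target=_worker, args=(rank, test_fn))
            for rank in range(world_size)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                # Re-raise in the parent, as _check_return_codes does.
                raise RuntimeError(
                    f"Process {rank} exited with error code {p.exitcode}"
                )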
2025-12-04T14:25:34.4650126Z 2025-12-04T14:25:34.4650264Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4650557Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4650559Z 2025-12-04T14:25:34.4650644Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4650708Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T14:25:34.4650770Z ======================= 1 failed, 14 deselected in 8.68s ======================= 2025-12-04T14:25:34.4650806Z Got exit code 1 2025-12-04T14:25:34.4651048Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T14:25:34.4651174Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T14:25:34.4651398Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-7dc9bcfb08f720d2.xml 2025-12-04T14:25:34.4651455Z ============================= test session starts ============================== 2025-12-04T14:25:34.4651568Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.4651622Z cachedir: .pytest_cache 2025-12-04T14:25:34.4651794Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.4651838Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.4651878Z configfile: pytest.ini 2025-12-04T14:25:34.4652038Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.4652420Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4652470Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.4652813Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4652870Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.4652937Z collected 15 items / 13 deselected / 2 selected 2025-12-04T14:25:34.4652988Z stepcurrent: skipping 13 already run items. 
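[editor's note] The sequence above ("Got exit code 1" -> "Retrying single test..." twice -> "FAILED CONSISTENTLY" -> continuing) is the harness separating flaky failures from real ones: a test that fails in the shard and then fails again in isolation is recorded as a consistent failure, and because continue-through-error is set, the rest of the shard still runs. A hypothetical sketch of that control flow (function and flag names are illustrative):

    import subprocess

    def run_then_isolate(test_id, continue_through_error=True):
        # First pass runs as part of the shard and stops at the first failure.
        if subprocess.run(["python", "-m", "pytest", "-x", test_id]).returncode == 0:
            return True
        # Re-run just the failing test to tell flakes from real failures.
        for _ in range(2):
            print("Retrying single test...")
            if subprocess.run(["python", "-m", "pytest", test_id]).returncode == 0:
                return True
        print(f"FAILED CONSISTENTLY: {test_id}")
        # With continue-through-error set, record the failure but keep going
        # with the rest of the tests in the shard.
        return continue_through_error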
2025-12-04T14:25:34.4653030Z Running 2 items in this shard 2025-12-04T14:25:34.4653032Z 2025-12-04T14:25:34.4653396Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda I1204 14:24:27.391000 373858 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 373927 2025-12-04T14:25:34.4653551Z I1204 14:24:27.392000 373858 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 373928 2025-12-04T14:25:34.4653700Z I1204 14:24:27.393000 373858 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 373929 2025-12-04T14:25:34.4653851Z I1204 14:24:27.393000 373858 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 373930 2025-12-04T14:25:34.4654525Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4654567Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4655228Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4655270Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4655932Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4655974Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4657529Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4657583Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4658087Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. 
If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4658136Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4658631Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4658679Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4659162Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4659207Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4659695Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4659740Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4659875Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4660029Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4660349Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4660498Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4660776Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4660891Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4661159Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4661301Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4661567Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4661734Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4662001Z E1204 14:24:34.621000 
373930 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4662127Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4662407Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4662546Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4663070Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4663177Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4663365Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4663779Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4663888Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4664089Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4664245Z E1204 14:24:34.621000 373930 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.4664374Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4664523Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4664801Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4664949Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4665225Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4665339Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4665607Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4665745Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4666030Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4666168Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4666433Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4666568Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4666835Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4666975Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4667490Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 
2025-12-04T14:25:34.4667596Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4667784Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4668195Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4668301Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4668501Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4668656Z E1204 14:24:34.637000 373927 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.4668783Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4668933Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4669210Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4669357Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4669631Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4669743Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4670009Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4670204Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4670471Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4670608Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4670887Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4671014Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4671281Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4671434Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4671939Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4672044Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4672230Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4672642Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4672747Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4672946Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4673102Z E1204 14:24:34.638000 373929 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4673229Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4673380Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4673655Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4673798Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4674073Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4674187Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4674468Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4674621Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4674888Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T14:25:34.4675025Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4675301Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4675428Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4675708Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4675846Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4676349Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4676456Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4676645Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4677054Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4677160Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4677361Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4677516Z E1204 14:24:34.693000 373928 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4677556Z FAILED [8.4205s] [ 50%] 2025-12-04T14:25:34.4677559Z 2025-12-04T14:25:34.4677612Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4677752Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda _ 2025-12-04T14:25:34.4677797Z Traceback (most recent call last): 2025-12-04T14:25:34.4677957Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4678000Z self._join_processes(fn) 2025-12-04T14:25:34.4678172Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4678225Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4678400Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4678466Z raise 
RuntimeError(error) 2025-12-04T14:25:34.4678544Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.4678589Z Traceback (most recent call last): 2025-12-04T14:25:34.4678749Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4678790Z getattr(self, test_name)() 2025-12-04T14:25:34.4678946Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4678980Z fn() 2025-12-04T14:25:34.4679137Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4679178Z method(*args, **kwargs) 2025-12-04T14:25:34.4679327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4679368Z method(*args, **kwargs) 2025-12-04T14:25:34.4679515Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4679562Z with policy(): 2025-12-04T14:25:34.4679711Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4679751Z raise RuntimeError(msg) 2025-12-04T14:25:34.4680137Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4680140Z 2025-12-04T14:25:34.4680251Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4680547Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4680551Z 2025-12-04T14:25:34.4680638Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4680640Z 2025-12-04T14:25:34.4680641Z 2025-12-04T14:25:34.4680717Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4680804Z Process 3 terminated with exit code 10, terminating remaining processes. 
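The parent-side traceback runs through _join_processes and _check_return_codes: the parent spawns one worker per rank, joins them, and re-raises the first worker's recorded exception when any rank exits with a non-zero code (10 here marks a test error). A minimal sketch of that spawn/join/check pattern, with illustrative names (run_rank, WORLD_SIZE) rather than the real MultiProcessTestCase API:

    import multiprocessing as mp

    WORLD_SIZE = 4                              # one worker per GPU rank

    def run_rank(rank):
        pass                                    # per-rank test body; exits 10 on error

    def main():
        procs = [mp.Process(target=run_rank, args=(r,)) for r in range(WORLD_SIZE)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()                            # _join_processes analogue
        for rank, p in enumerate(procs):        # _check_return_codes analogue
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

    if __name__ == "__main__":
        main()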
2025-12-04T14:25:34.4681073Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-7dc9bcfb08f720d2.xml - 2025-12-04T14:25:34.4681132Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4681436Z FAILED [8.4205s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T14:25:34.4681482Z Traceback (most recent call last): 2025-12-04T14:25:34.4681645Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4681688Z getattr(self, test_name)() 2025-12-04T14:25:34.4681847Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4681880Z fn() 2025-12-04T14:25:34.4682032Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4682071Z method(*args, **kwargs) 2025-12-04T14:25:34.4682219Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4682272Z method(*args, **kwargs) 2025-12-04T14:25:34.4682436Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4682472Z with policy(): 2025-12-04T14:25:34.4682623Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4682663Z raise RuntimeError(msg) 2025-12-04T14:25:34.4683062Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4683064Z 2025-12-04T14:25:34.4683139Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4683429Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4683432Z 2025-12-04T14:25:34.4683529Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4683592Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T14:25:34.4683653Z ======================= 1 failed, 13 deselected in 8.58s ======================= 2025-12-04T14:25:34.4683689Z Got exit code 1 2025-12-04T14:25:34.4683728Z Retrying single test... 
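The "Retrying single test..." line reflects the shard runner's failure handling: after a shard-level failure it reruns just the failing test id; a pass on retry classifies the test as flaky, while repeated failures produce the "FAILED CONSISTENTLY" marker seen earlier, and continue-through-error lets the run proceed either way. A sketch of that control flow under those assumptions; run_pytest is a stand-in, not the actual run_test.py helper:

    import subprocess, sys

    def run_pytest(args):
        return subprocess.run([sys.executable, "-m", "pytest", *args]).returncode

    def run_with_retry(test_id, shard_args, retries=1):
        if run_pytest(shard_args) == 0:
            return "passed"
        for _ in range(retries):
            print("Retrying single test...")
            if run_pytest([test_id]) == 0:
                return "flaky"                  # failed in the shard, passed alone
        print(f"FAILED CONSISTENTLY: {test_id}")
        return "consistent-failure"             # continue-through-error keeps going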
2025-12-04T14:25:34.4683954Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-edbe14b2f22d7181.xml 2025-12-04T14:25:34.4684012Z ============================= test session starts ============================== 2025-12-04T14:25:34.4684124Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T14:25:34.4684165Z cachedir: .pytest_cache 2025-12-04T14:25:34.4684321Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T14:25:34.4684367Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T14:25:34.4684407Z configfile: pytest.ini 2025-12-04T14:25:34.4684567Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T14:25:34.4684924Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4684974Z class TestDummyModel(torch.nn.Module): 2025-12-04T14:25:34.4685317Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T14:25:34.4685374Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T14:25:34.4685429Z collected 15 items / 14 deselected / 1 selected 2025-12-04T14:25:34.4685714Z stepcurrent: skipping 13 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4685757Z Running 1 items in this shard 2025-12-04T14:25:34.4685759Z 2025-12-04T14:25:34.4686121Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda I1204 14:24:38.318000 374260 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 374329 2025-12-04T14:25:34.4686294Z I1204 14:24:38.319000 374260 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 374330 2025-12-04T14:25:34.4686454Z I1204 14:24:38.320000 374260 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 374331 2025-12-04T14:25:34.4686603Z I1204 14:24:38.320000 374260 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 374332 2025-12-04T14:25:34.4687284Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T14:25:34.4687328Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4687997Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4688040Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4688713Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4688755Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4689420Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4689462Z FSDP.set_state_dict_type( 2025-12-04T14:25:34.4689959Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4690009Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4690525Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4690573Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4691055Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4691126Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4691610Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T14:25:34.4691654Z device = _get_pg_default_device(group) 2025-12-04T14:25:34.4691788Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4691953Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4692233Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4692393Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4692672Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4692787Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4693055Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4693195Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4693462Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4693601Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4693866Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4693994Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4694263Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4694404Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4694913Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 
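Each failure block also prints a ready-made repro command. Expressed in Python for consistency with the surrounding sketches (the log's version is the equivalent one-line shell command, with the same environment variables and test id, run from the base repo dir):

    import os, subprocess, sys

    env = dict(os.environ,
               PYTORCH_TEST_WITH_ROCM="1",            # run the CUDA test suite on ROCm
               PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")  # enable the leak checker
    subprocess.run(
        [sys.executable,
         "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py",
         "TestFSDPWithDeviceMeshAndDTensorCUDA."
         "test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda"],
        env=env, check=True)                          # invoke from the base repo dir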
2025-12-04T14:25:34.4695021Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4695209Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4695629Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4695746Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4695948Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4696102Z E1204 14:24:45.649000 374329 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.4696240Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4696391Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4696678Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4696821Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4697097Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4697210Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4697480Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4697620Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4697888Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4698027Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4698294Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4698420Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4698688Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4698829Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4699334Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4699440Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4699646Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4700068Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4700208Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4700425Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4700582Z E1204 14:24:45.658000 374331 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4700711Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4700862Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4701160Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4701305Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4701579Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4701690Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4701956Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4702098Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4702363Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T14:25:34.4702501Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4702765Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4702891Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4703161Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4703300Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4703803Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4703923Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4704121Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4704529Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T14:25:34.4704635Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4704848Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4705004Z E1204 14:24:45.674000 374330 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4705132Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4705292Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4705567Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4705713Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4705985Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4706100Z E1204 14:24:45.689000 374332 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4706365Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4706503Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4706769Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4706906Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4707172Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4707301Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4707568Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4707705Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4708207Z E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 803209216 and is now 2587885568. 
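Note that the baselines differ per device within the same rerun (803209216 bytes on device 3 here versus 1268776960 on devices 1 and 2 above): the checker appears to snapshot each device independently before the test body runs. A per-device snapshot loop in that spirit, assuming torch.cuda is available:

    import torch

    for d in range(torch.cuda.device_count()):
        free, total = torch.cuda.mem_get_info(d)
        print(f"device {d}: driver-allocated ~{total - free} bytes")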
E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935]
  To execute this test, run the following from the base repo dir:
  PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda

  This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
E1204 14:24:45.689000 374332 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
FAILED [8.7192s] [100%]

=================================== FAILURES ===================================
_ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda _
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
    self._join_processes(fn)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
    self._check_return_codes(fn, elapsed_time)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
    raise RuntimeError(error)
RuntimeError: Process 1 exited with error code 10 and exception:
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
    getattr(self, test_name)()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
    fn()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
    with policy():
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
    raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

----------------------------- Captured stdout call -----------------------------
Process 1 terminated with exit code 10, terminating remaining processes.
- generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-edbe14b2f22d7181.xml -
=========================== short test summary info ============================
FAILED [8.7192s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception:
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
    getattr(self, test_name)()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
    fn()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
    with policy():
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
    raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
======================= 1 failed, 14 deselected in 8.88s =======================
Got exit code 1
Retrying single test...
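The failure above is raised by the harness's CUDA memory-leak checker (the policy() context manager whose __exit__ appears at common_utils.py:2705 in the traceback), not by the test body: it snapshots per-device memory before the test and re-checks it afterwards, and here every rank reports the same 7680-byte growth in the caching allocator plus over a gigabyte of additional driver-allocated memory. A minimal sketch of that before/after comparison, using public torch.cuda APIs rather than the actual internal implementation (the helper name is illustrative):

    import torch

    def assert_no_cuda_mem_leak(test_fn, device: int = 0) -> None:
        # Snapshot the caching allocator and the driver-level view before the test.
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)   # bytes held by live tensors
        free, total = torch.cuda.mem_get_info(device)        # driver view: (free, total) bytes
        driver_before = total - free

        test_fn()

        # Re-measure afterwards; any growth means an allocation outlived the test.
        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free

        if alloc_after > alloc_before:
            raise RuntimeError(
                f"Caching allocator allocated memory was {alloc_before} and is now "
                f"reported as {alloc_after} on device {device}. CUDA driver allocated "
                f"memory was {driver_before} and is now {driver_after}."
            )

On this gfx942 runner the same code path runs against HIP, since the ROCm build exposes its allocator through the torch.cuda namespace; that is why an AMD machine reports "CUDA driver" numbers.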
Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-405eb8bd3a7e40b2.xml
============================= test session starts ==============================
platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
cachedir: .pytest_cache
hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
rootdir: /var/lib/jenkins/pytorch
configfile: pytest.ini
plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
  class TestDummyModel(torch.nn.Module):
/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
  class TestDummyModelUneven(torch.nn.Module):
collected 15 items / 14 deselected / 1 selected
stepcurrent: skipping 13 already run items.
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda
Running 1 items in this shard

distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda
I1204 14:24:49.445000 374662 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 374731
I1204 14:24:49.446000 374662 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 374732
I1204 14:24:49.446000 374662 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 374733
I1204 14:24:49.447000 374662 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 374734
/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
  FSDP.set_state_dict_type(
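The FutureWarning above, which each of the four ranks emits from line 80 of the test file, names its own replacement: the unified state-dict helpers in torch.distributed.checkpoint.state_dict (the API doc and tutorial links are in the warning text). A minimal sketch of the suggested migration, where `model` and `optimizer` are illustrative placeholders for an FSDP-wrapped module and its optimizer, not the test's actual objects:

    from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

    # Deprecated pattern, as still used at test_fsdp_dtensor_state_dict.py:80:
    #     FSDP.set_state_dict_type(model, ...)
    #     state = model.state_dict()

    # Replacement: one call that handles FSDP1, FSDP2 and DDP uniformly.
    model_state, optim_state = get_state_dict(model, optimizers=optimizer)

    # ... persist model_state / optim_state (e.g. via torch.distributed.checkpoint) ...

    # Loading goes through the symmetric setter; the state-dict arguments are keyword-only.
    set_state_dict(
        model,
        optimizers=optimizer,
        model_state_dict=model_state,
        optim_state_dict=optim_state,
    )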
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
  device = _get_pg_default_device(group)
E1204 14:24:56.751000 374731 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
  Traceback (most recent call last):
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
      getattr(self, test_name)()
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
      fn()
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
      method(*args, **kwargs)
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
      method(*args, **kwargs)
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
      with policy():
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
      raise RuntimeError(msg)
  RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208.

  To execute this test, run the following from the base repo dir:
  PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda

  This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
E1204 14:24:56.751000 374731 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
E1204 14:24:56.754000 374734 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
  Traceback (most recent call last):
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
      getattr(self, test_name)()
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
      fn()
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
      method(*args, **kwargs)
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
      method(*args, **kwargs)
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
      with policy():
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
      raise RuntimeError(msg)
  RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 952107008 and is now 2587885568.

  To execute this test, run the following from the base repo dir:
  PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda

  This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
E1204 14:24:56.754000 374734 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
E1204 14:24:56.760000 374733 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
  Traceback (most recent call last):
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
      getattr(self, test_name)()
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
      fn()
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
      method(*args, **kwargs)
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
      method(*args, **kwargs)
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
      with policy():
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
      raise RuntimeError(msg)
  RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568.

  To execute this test, run the following from the base repo dir:
  PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda

  This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
E1204 14:24:56.760000 374733 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
E1204 14:24:56.764000 374732 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
  Traceback (most recent call last):
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
      getattr(self, test_name)()
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
      fn()
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
      method(*args, **kwargs)
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
      method(*args, **kwargs)
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
      with policy():
    File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
      raise RuntimeError(msg)
  RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568.

  To execute this test, run the following from the base repo dir:
  PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda

  This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
E1204 14:24:56.764000 374732 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
FAILED [8.5195s] [100%]

=================================== FAILURES ===================================
_ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda _
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
    self._join_processes(fn)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
    self._check_return_codes(fn, elapsed_time)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
    raise RuntimeError(error)
RuntimeError: Process 0 exited with error code 10 and exception:
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
    getattr(self, test_name)()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
    fn()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
    with policy():
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
    raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

----------------------------- Captured stdout call -----------------------------
Process 0 terminated with exit code 10, terminating remaining processes.
- generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-405eb8bd3a7e40b2.xml -
=========================== short test summary info ============================
FAILED [8.5195s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
    getattr(self, test_name)()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
    fn()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
    with policy():
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
    raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
======================= 1 failed, 14 deselected in 8.68s =======================
Got exit code 1
FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda
Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-acb8c696c5dff142.xml
============================= test session starts ==============================
platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
cachedir: .pytest_cache
hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
rootdir: /var/lib/jenkins/pytorch
configfile: pytest.ini
plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
  class TestDummyModel(torch.nn.Module):
/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
  class TestDummyModelUneven(torch.nn.Module):
collected 15 items / 14 deselected / 1 selected
stepcurrent: skipping 14 already run items.
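Separately from the leak, the two PytestCollectionWarnings printed during every collection pass are easy to silence: pytest tries to collect any class whose name matches its Test* pattern, and it refuses (with exactly this warning) when the class defines __init__, as these torch.nn.Module helpers do. A sketch of the two conventional fixes, with an illustrative module body:

    import torch

    # Fix 1: rename the helper so it no longer matches pytest's Test* collection pattern.
    class DummyModel(torch.nn.Module):
        def __init__(self) -> None:
            super().__init__()
            self.net = torch.nn.Linear(8, 8)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.net(x)

    # Fix 2: keep the name but opt the class out of collection explicitly.
    class TestDummyModel(torch.nn.Module):
        __test__ = False  # pytest skips this class without warning

        def __init__(self) -> None:
            super().__init__()

Either form leaves the collected test set unchanged; it only removes the warning noise repeated in every session above.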
2025-12-04T14:25:34.4749715Z Running 1 items in this shard 2025-12-04T14:25:34.4749717Z 2025-12-04T14:25:34.4750048Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda I1204 14:25:00.453000 375064 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 375133 2025-12-04T14:25:34.4750235Z I1204 14:25:00.454000 375064 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 375134 2025-12-04T14:25:34.4750386Z I1204 14:25:00.454000 375064 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 375135 2025-12-04T14:25:34.4750552Z I1204 14:25:00.455000 375064 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 375136 2025-12-04T14:25:34.4751278Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4751371Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T14:25:34.4752072Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4752162Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T14:25:34.4752857Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4752946Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T14:25:34.4753644Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T14:25:34.4753730Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T14:25:34.4753863Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4754013Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4754291Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4754452Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4754744Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4754858Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4755125Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4755276Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4755545Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4755696Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4755964Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4756091Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4756359Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4756497Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4756977Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2740977664. 
2025-12-04T14:25:34.4757085Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4757273Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4757650Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T14:25:34.4757759Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4757963Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4758120Z E1204 14:25:07.759000 375133 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.4758248Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4758400Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4758676Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4758839Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4759114Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4759226Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4759501Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4759639Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4759907Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4760054Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4760354Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4760485Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4760752Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4760894Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4761363Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4761470Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4761657Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4762032Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T14:25:34.4762140Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4762343Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4762498Z E1204 14:25:07.777000 375134 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4762626Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4762782Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4763071Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4763228Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4763501Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4763613Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4763893Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4764033Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4764319Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4764459Z E1204 14:25:07.782000 
375135 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4764727Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4764854Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4765126Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4765266Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4765736Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4765843Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4766029Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4766403Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T14:25:34.4766509Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4766711Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4766865Z E1204 14:25:07.782000 375135 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4766993Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4767143Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4767443Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4767587Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4767859Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4767981Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4768247Z E1204 14:25:07.782000 375136 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4768387Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4768661Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4768801Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4769068Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4769194Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4769463Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4769602Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4770073Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1262485504 and is now 2587885568. 
2025-12-04T14:25:34.4770214Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4770402Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4770774Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda
2025-12-04T14:25:34.4770880Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4771079Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4771233Z E1204 14:25:07.782000 375136 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.4771272Z FAILED [8.6183s] [100%]
2025-12-04T14:25:34.4771291Z
2025-12-04T14:25:34.4771345Z =================================== FAILURES ===================================
2025-12-04T14:25:34.4771464Z ___ TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda ____
2025-12-04T14:25:34.4771509Z Traceback (most recent call last):
2025-12-04T14:25:34.4771671Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.4771715Z     self._join_processes(fn)
2025-12-04T14:25:34.4771888Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.4771939Z     self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.4772128Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.4772169Z     raise RuntimeError(error)
2025-12-04T14:25:34.4772248Z RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T14:25:34.4772292Z Traceback (most recent call last):
2025-12-04T14:25:34.4772454Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4772506Z     getattr(self, test_name)()
2025-12-04T14:25:34.4772666Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4772699Z     fn()
2025-12-04T14:25:34.4772852Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4772891Z     method(*args, **kwargs)
2025-12-04T14:25:34.4773042Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4773081Z     method(*args, **kwargs)
2025-12-04T14:25:34.4773229Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4773266Z     with policy():
2025-12-04T14:25:34.4773417Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4773459Z     raise RuntimeError(msg)
2025-12-04T14:25:34.4773815Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2740977664.
2025-12-04T14:25:34.4773817Z
2025-12-04T14:25:34.4773892Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4774148Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda
2025-12-04T14:25:34.4774151Z
2025-12-04T14:25:34.4774239Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4774242Z
2025-12-04T14:25:34.4774243Z
2025-12-04T14:25:34.4774317Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.4774406Z Process 0 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.4774677Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-acb8c696c5dff142.xml -
2025-12-04T14:25:34.4774737Z =========================== short test summary info ============================
2025-12-04T14:25:34.4775009Z FAILED [8.6183s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T14:25:34.4775055Z Traceback (most recent call last):
2025-12-04T14:25:34.4775229Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4775285Z     getattr(self, test_name)()
2025-12-04T14:25:34.4775442Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4775479Z     fn()
2025-12-04T14:25:34.4775629Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4775668Z     method(*args, **kwargs)
2025-12-04T14:25:34.4775815Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4775856Z     method(*args, **kwargs)
2025-12-04T14:25:34.4776015Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4776052Z     with policy():
2025-12-04T14:25:34.4776201Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4776242Z     raise RuntimeError(msg)
2025-12-04T14:25:34.4776606Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2740977664.
2025-12-04T14:25:34.4776608Z
2025-12-04T14:25:34.4776682Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4776936Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda
2025-12-04T14:25:34.4776938Z
2025-12-04T14:25:34.4777022Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4777087Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.4777148Z ======================= 1 failed, 14 deselected in 8.78s =======================
2025-12-04T14:25:34.4777184Z Got exit code 1
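
The RuntimeError above comes from the memory-leak check that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables for this mem_leak_check shard (on this ROCm gfx942 runner the torch.cuda calls are serviced through HIP). The two numbers quoted in the message are snapshots taken around the test body: one from PyTorch's caching allocator and one from the driver. Below is a minimal sketch of that before/after comparison using only public torch.cuda APIs; it is illustrative, not the actual CudaMemoryLeakCheck code in common_utils.py, and the function name assert_no_cuda_leak is made up here.

    # Minimal sketch (assumption: NOT PyTorch's real CudaMemoryLeakCheck).
    # torch.cuda.memory_allocated() supplies the "caching allocator" number
    # from the message; torch.cuda.mem_get_info() supplies the driver-level
    # number as total minus free.
    import torch

    def assert_no_cuda_leak(test_fn, device: int = 0) -> None:
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)
        free, total = torch.cuda.mem_get_info(device)
        driver_before = total - free

        test_fn()

        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free, total = torch.cuda.mem_get_info(device)
        driver_after = total - free

        # Report a leak as "confirmed" only when the driver-level count grew
        # along with the caching-allocator count, mirroring the wording above.
        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"CUDA driver API confirmed a leak! Caching allocator allocated "
                f"memory was {alloc_before} and is now reported as {alloc_after} "
                f"on device {device}. CUDA driver allocated memory was "
                f"{driver_before} and is now {driver_after}."
            )

Requiring both counters to grow is what makes the check robust to allocator caching: the caching allocator can legitimately hold on to freed blocks, but driver memory growing at the same time is what the harness treats as corroboration.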
2025-12-04T14:25:34.4777223Z Retrying single test...
2025-12-04T14:25:34.4777450Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-21d24b4df4a7a8d1.xml
2025-12-04T14:25:34.4777506Z ============================= test session starts ==============================
2025-12-04T14:25:34.4777617Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.4777656Z cachedir: .pytest_cache
2025-12-04T14:25:34.4777813Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.4777857Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.4777897Z configfile: pytest.ini
2025-12-04T14:25:34.4778057Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.4778416Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4778466Z   class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.4778812Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4778868Z   class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.4778922Z collected 15 items / 14 deselected / 1 selected
2025-12-04T14:25:34.4779179Z stepcurrent: skipping 14 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda
2025-12-04T14:25:34.4779231Z Running 1 items in this shard
2025-12-04T14:25:34.4779233Z
2025-12-04T14:25:34.4779562Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda I1204 14:25:11.632000 375466 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 375535
2025-12-04T14:25:34.4779714Z I1204 14:25:11.632000 375466 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 375536
2025-12-04T14:25:34.4779883Z I1204 14:25:11.633000 375466 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 375537
2025-12-04T14:25:34.4780031Z I1204 14:25:11.633000 375466 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 375538
2025-12-04T14:25:34.4780796Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4780889Z   prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type(
2025-12-04T14:25:34.4781596Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4781687Z   prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type(
2025-12-04T14:25:34.4782386Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4782472Z   prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type(
2025-12-04T14:25:34.4783172Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T14:25:34.4783258Z   prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type(
2025-12-04T14:25:34.4783390Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4783542Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.4783822Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4783997Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.4784275Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4784388Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.4784667Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4784807Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4785084Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4785225Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4785496Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4785625Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.4785892Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4786033Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.4786510Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2740977664.
2025-12-04T14:25:34.4786616Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4786804Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4787175Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda
2025-12-04T14:25:34.4787284Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4787484Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4787639Z E1204 14:25:18.946000 375535 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T14:25:34.4787768Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4787917Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.4788202Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4788356Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.4788634Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4788746Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.4789023Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4789163Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4789437Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4789575Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4789841Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4789969Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.4790264Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4790407Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.4790877Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1243611136 and is now 2587885568.
2025-12-04T14:25:34.4790985Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4791171Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4791543Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda
2025-12-04T14:25:34.4791649Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4791850Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4792005Z E1204 14:25:18.953000 375537 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T14:25:34.4792131Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4792297Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.4792587Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4792731Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.4793006Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4793134Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.4793404Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4793544Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4793823Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4793961Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4794228Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4794353Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.4794623Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4794763Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.4795233Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2587885568.
2025-12-04T14:25:34.4795339Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4795527Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4795900Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda
2025-12-04T14:25:34.4796004Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4796207Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4796362Z E1204 14:25:18.953000 375538 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T14:25:34.4796490Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T14:25:34.4796650Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T14:25:34.4796943Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4797088Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T14:25:34.4797371Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4797485Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T14:25:34.4797752Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4797904Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4798171Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4798309Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T14:25:34.4798578Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4798703Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T14:25:34.4798971Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4799108Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T14:25:34.4799576Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568.
2025-12-04T14:25:34.4799683Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4799869Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4800277Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda
2025-12-04T14:25:34.4800381Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T14:25:34.4800583Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4800736Z E1204 14:25:19.018000 375536 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T14:25:34.4800788Z FAILED [8.4204s] [100%]
2025-12-04T14:25:34.4800803Z
2025-12-04T14:25:34.4800857Z =================================== FAILURES ===================================
2025-12-04T14:25:34.4800964Z ___ TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda ____
2025-12-04T14:25:34.4801009Z Traceback (most recent call last):
2025-12-04T14:25:34.4801169Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T14:25:34.4801212Z     self._join_processes(fn)
2025-12-04T14:25:34.4801383Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T14:25:34.4801449Z     self._check_return_codes(fn, elapsed_time)
2025-12-04T14:25:34.4801626Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T14:25:34.4801669Z     raise RuntimeError(error)
2025-12-04T14:25:34.4801748Z RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T14:25:34.4801793Z Traceback (most recent call last):
2025-12-04T14:25:34.4801966Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4802007Z     getattr(self, test_name)()
2025-12-04T14:25:34.4802192Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4804631Z     fn()
2025-12-04T14:25:34.4804797Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4804842Z     method(*args, **kwargs)
2025-12-04T14:25:34.4804994Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4805036Z     method(*args, **kwargs)
2025-12-04T14:25:34.4805195Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4805234Z     with policy():
2025-12-04T14:25:34.4805387Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4805427Z     raise RuntimeError(msg)
2025-12-04T14:25:34.4805785Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1243611136 and is now 2587885568.
2025-12-04T14:25:34.4805788Z
2025-12-04T14:25:34.4805865Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4806123Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda
2025-12-04T14:25:34.4806127Z
2025-12-04T14:25:34.4806215Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4806218Z
2025-12-04T14:25:34.4806219Z
2025-12-04T14:25:34.4806298Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T14:25:34.4806384Z Process 2 terminated with exit code 10, terminating remaining processes.
2025-12-04T14:25:34.4806655Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-21d24b4df4a7a8d1.xml -
2025-12-04T14:25:34.4806716Z =========================== short test summary info ============================
2025-12-04T14:25:34.4806990Z FAILED [8.4204s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda - RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T14:25:34.4807060Z Traceback (most recent call last):
2025-12-04T14:25:34.4807237Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T14:25:34.4807281Z     getattr(self, test_name)()
2025-12-04T14:25:34.4807441Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T14:25:34.4807475Z     fn()
2025-12-04T14:25:34.4807624Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4807663Z     method(*args, **kwargs)
2025-12-04T14:25:34.4807823Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T14:25:34.4807863Z     method(*args, **kwargs)
2025-12-04T14:25:34.4808008Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T14:25:34.4808050Z     with policy():
2025-12-04T14:25:34.4808199Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T14:25:34.4808252Z     raise RuntimeError(msg)
2025-12-04T14:25:34.4808606Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1243611136 and is now 2587885568.
2025-12-04T14:25:34.4808609Z
2025-12-04T14:25:34.4808683Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4808937Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda
2025-12-04T14:25:34.4808943Z
2025-12-04T14:25:34.4809028Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4809091Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.4809154Z ======================= 1 failed, 14 deselected in 8.58s =======================
2025-12-04T14:25:34.4809190Z Got exit code 1
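
The structure of the failure is the same in every run: each rank that trips the leak check exits with code 10, and the parent process (_join_processes and _check_return_codes in the traceback) turns the first non-zero exit code into the test-level RuntimeError, killing the remaining ranks. A self-contained sketch of that spawn-join-check pattern follows; the names and the simulated failure are illustrative, not the real common_distributed.py code.

    # Illustrative sketch of the multiprocess harness pattern (assumption:
    # not common_distributed.py verbatim).
    import multiprocessing as mp
    import sys

    TEST_FAILURE_EXIT_CODE = 10  # the "exit code: 10" each failing rank reports

    def worker(rank: int) -> None:
        # Stand-in for the per-rank test body; rank 0 simulates a failed
        # leak check so the parent sees one bad exit code, as in the log.
        failed = (rank == 0)
        sys.exit(TEST_FAILURE_EXIT_CODE if failed else 0)

    def join_and_check(world_size: int = 4) -> None:
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=worker, args=(r,)) for r in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:  # corresponds to _join_processes in the traceback
            p.join()
        for rank, p in enumerate(procs):  # corresponds to _check_return_codes
            if p.exitcode != 0:
                raise RuntimeError(
                    f"Process {rank} exited with error code {p.exitcode}"
                )

    if __name__ == "__main__":
        join_and_check()

This also explains why the reported rank differs between runs (Process 0 here, Process 2 in the previous attempt): whichever rank's non-zero exit code is reached first becomes the headline error, even though all four ranks detected the same leak.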
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda 2025-12-04T14:25:34.4811335Z Running 1 items in this shard 2025-12-04T14:25:34.4811337Z 2025-12-04T14:25:34.4811665Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda I1204 14:25:22.713000 375868 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 375937 2025-12-04T14:25:34.4811831Z I1204 14:25:22.713000 375868 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 375938 2025-12-04T14:25:34.4811983Z I1204 14:25:22.714000 375868 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 375939 2025-12-04T14:25:34.4812132Z I1204 14:25:22.714000 375868 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 375940 2025-12-04T14:25:34.4812861Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4812953Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T14:25:34.4813657Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4813750Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T14:25:34.4814443Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T14:25:34.4814531Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T14:25:34.4815232Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T14:25:34.4815317Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T14:25:34.4815450Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4815602Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4815891Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4816047Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4816325Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4816437Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4816714Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4816856Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4817137Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4817276Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4817540Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4817668Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4817936Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4818077Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4818548Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2740977664. 
2025-12-04T14:25:34.4818656Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4818844Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4819217Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T14:25:34.4819324Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4819524Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4819680Z E1204 14:25:30.056000 375937 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T14:25:34.4819807Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4819967Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4820304Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4820448Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4820722Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4820849Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4821115Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4821256Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4821534Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4821672Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4821940Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4822065Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4822333Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4822473Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4822942Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T14:25:34.4823048Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4823234Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4823606Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T14:25:34.4823711Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4823913Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4824071Z E1204 14:25:30.123000 375938 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T14:25:34.4824197Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4824371Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4824646Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4824790Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4825075Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4825189Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4825454Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4825602Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4825868Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4826004Z E1204 14:25:30.135000 
375940 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4826271Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4826398Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4826667Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4826804Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4827273Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1256194048 and is now 2587885568. 2025-12-04T14:25:34.4827378Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4827565Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4827936Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T14:25:34.4828039Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4828241Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4828395Z E1204 14:25:30.135000 375940 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T14:25:34.4828544Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T14:25:34.4828703Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T14:25:34.4828980Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4829122Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T14:25:34.4829406Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4829520Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T14:25:34.4829794Z E1204 14:25:30.135000 375939 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4829932Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4830230Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4830371Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T14:25:34.4830636Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4830765Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T14:25:34.4831035Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4831174Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T14:25:34.4831644Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T14:25:34.4831750Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4831939Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4832310Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T14:25:34.4832415Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T14:25:34.4832616Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4832785Z E1204 14:25:30.135000 375939 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T14:25:34.4832842Z FAILED [8.5207s] [100%] 2025-12-04T14:25:34.4832844Z 2025-12-04T14:25:34.4832898Z =================================== FAILURES =================================== 2025-12-04T14:25:34.4833005Z ___ TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda ____ 2025-12-04T14:25:34.4833049Z Traceback (most recent call last): 2025-12-04T14:25:34.4833210Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T14:25:34.4833253Z self._join_processes(fn) 2025-12-04T14:25:34.4833436Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T14:25:34.4833488Z self._check_return_codes(fn, elapsed_time) 2025-12-04T14:25:34.4833665Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T14:25:34.4833708Z raise RuntimeError(error) 2025-12-04T14:25:34.4833787Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.4833848Z Traceback (most recent call last): 2025-12-04T14:25:34.4834011Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4834052Z getattr(self, test_name)() 2025-12-04T14:25:34.4834209Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4834242Z fn() 2025-12-04T14:25:34.4834391Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4834430Z method(*args, **kwargs) 2025-12-04T14:25:34.4834578Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4834618Z method(*args, **kwargs) 2025-12-04T14:25:34.4834765Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4834802Z with policy(): 2025-12-04T14:25:34.4834952Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4834991Z raise RuntimeError(msg) 2025-12-04T14:25:34.4835346Z 
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2740977664. 2025-12-04T14:25:34.4835348Z 2025-12-04T14:25:34.4835422Z To execute this test, run the following from the base repo dir: 2025-12-04T14:25:34.4835677Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T14:25:34.4835680Z 2025-12-04T14:25:34.4835767Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T14:25:34.4835769Z 2025-12-04T14:25:34.4835771Z 2025-12-04T14:25:34.4835844Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T14:25:34.4835930Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T14:25:34.4836196Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-7197ab72c271a347.xml - 2025-12-04T14:25:34.4836259Z =========================== short test summary info ============================ 2025-12-04T14:25:34.4836529Z FAILED [8.5207s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T14:25:34.4836597Z Traceback (most recent call last): 2025-12-04T14:25:34.4836759Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T14:25:34.4836801Z getattr(self, test_name)() 2025-12-04T14:25:34.4836956Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T14:25:34.4836990Z fn() 2025-12-04T14:25:34.4837137Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4837191Z method(*args, **kwargs) 2025-12-04T14:25:34.4837340Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T14:25:34.4837381Z method(*args, **kwargs) 2025-12-04T14:25:34.4837528Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T14:25:34.4837566Z with policy(): 2025-12-04T14:25:34.4837723Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T14:25:34.4837763Z raise RuntimeError(msg) 2025-12-04T14:25:34.4838114Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2740977664. 
2025-12-04T14:25:34.4838116Z
2025-12-04T14:25:34.4838189Z To execute this test, run the following from the base repo dir:
2025-12-04T14:25:34.4838443Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda
2025-12-04T14:25:34.4838447Z
2025-12-04T14:25:34.4838531Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T14:25:34.4838595Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T14:25:34.4838655Z ======================= 1 failed, 14 deselected in 8.68s =======================
2025-12-04T14:25:34.4838692Z Got exit code 1
2025-12-04T14:25:34.4838894Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda
2025-12-04T14:25:34.4839018Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T14:25:34.4839241Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ffec1b91353d960b.xml
2025-12-04T14:25:34.4839299Z ============================= test session starts ==============================
2025-12-04T14:25:34.4839410Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T14:25:34.4839452Z cachedir: .pytest_cache
2025-12-04T14:25:34.4839610Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T14:25:34.4839655Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T14:25:34.4839695Z configfile: pytest.ini
2025-12-04T14:25:34.4839858Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T14:25:34.4840242Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4840309Z   class TestDummyModel(torch.nn.Module):
2025-12-04T14:25:34.4840653Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T14:25:34.4840721Z   class TestDummyModelUneven(torch.nn.Module):
2025-12-04T14:25:34.4840776Z collected 15 items / 15 deselected / 0 selected
2025-12-04T14:25:34.4840827Z stepcurrent: skipping 15 already run items.
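The "stepcurrent: skipping 15 already run items." line above comes from the harness's stepcurrent pytest plugin, which persists how far a previous invocation of the same command got, so the automatic re-run triggered after the -x abort does not repeat tests that already executed; that is why this second session collects 15 items, deselects all of them, and reports "Running 0 items in this shard" just below. A minimal sketch of that resume idea, using pytest's built-in config.cache store rather than the plugin's actual bookkeeping (the cache key and message format here are invented for illustration):

    # conftest.py -- illustrative resume-from-cache plugin; NOT the real
    # "stepcurrent" plugin shipped with pytorch's test harness.
    CACHE_KEY = "resume/already_run"  # hypothetical cache key

    def pytest_collection_modifyitems(config, items):
        # Deselect anything a previous run of this same command completed.
        already_run = set(config.cache.get(CACHE_KEY, []))
        deselected = [it for it in items if it.nodeid in already_run]
        if deselected:
            print(f"resume: skipping {len(deselected)} already run items.")
            config.hook.pytest_deselected(items=deselected)
            items[:] = [it for it in items if it.nodeid not in already_run]

    def pytest_runtest_teardown(item, nextitem):
        # Record each finished test id so a retry resumes where this run ended.
        done = item.config.cache.get(CACHE_KEY, [])
        if item.nodeid not in done:
            item.config.cache.set(CACHE_KEY, done + [item.nodeid])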
2025-12-04T14:25:34.4840869Z Running 0 items in this shard
2025-12-04T14:25:34.4840871Z
2025-12-04T14:25:34.4841147Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ffec1b91353d960b.xml -
2025-12-04T14:25:34.4841207Z ============================ 15 deselected in 0.01s ============================
2025-12-04T14:25:34.4844949Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda']
2025-12-04T14:25:34.4844957Z
2025-12-04T14:25:34.4845176Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 (test/test-reports/distributed.fsdp.test_fsdp_dtensor_state_dict_1.1_b117e26eea61d004_.log)
2025-12-04T14:25:34.4845188Z
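The leak check that produced the failure above is opted into with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 from the repro command, and the "with policy():" / "__exit__" frames in the traceback show its shape: a context manager that snapshots per-device memory counters around the test body and raises if they grew. A rough sketch of that pattern, using only public torch.cuda APIs (HIP-backed on this ROCm runner); this is an approximation, not the actual implementation at common_utils.py:2705, which is more careful (for one thing, it cross-checks caching-allocator growth against driver-level growth before declaring a leak, as the message above shows):

    import torch

    class MemLeakCheck:
        """Illustrative before/after snapshot; not torch's real leak checker."""

        def __init__(self, device=0):
            self.device = device

        def __enter__(self):
            torch.cuda.synchronize(self.device)
            # Caching-allocator bytes and driver-level usage before the test.
            self.alloc_before = torch.cuda.memory_allocated(self.device)
            free, total = torch.cuda.mem_get_info(self.device)
            self.driver_before = total - free
            return self

        def __exit__(self, exc_type, exc, tb):
            if exc_type is not None:
                return False  # never mask the test's own failure
            torch.cuda.synchronize(self.device)
            alloc_after = torch.cuda.memory_allocated(self.device)
            free, total = torch.cuda.mem_get_info(self.device)
            driver_after = total - free
            # Only flag when both counters grew, mirroring the log message:
            # allocator 0 -> 7680 bytes, driver 1421869056 -> 2740977664
            # bytes (a delta of 1319108608 bytes, roughly 1.23 GiB).
            if alloc_after > self.alloc_before and driver_after > self.driver_before:
                raise RuntimeError(
                    f"possible leak on device {self.device}: caching allocator "
                    f"{self.alloc_before} -> {alloc_after}, driver "
                    f"{self.driver_before} -> {driver_after}"
                )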
2025-12-04T14:25:34.4845329Z Finished distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 ... [2025-12-04 14:25:34.290517][2243444.780493297], took 8.55min 2025-12-04T14:25:34.4845605Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:25:34.4845693Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:25:34.4845786Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T14:25:34.4845836Z Uploading artifacts took 0.00 seconds 2025-12-04T14:25:34.4845906Z distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 failed! 2025-12-04T14:25:34.4846031Z Running distributed/fsdp/test_fsdp_comm_hooks 1/1 ... [2025-12-04 14:25:34.294197][2243444.784176875] 2025-12-04T14:25:34.4846079Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:25:34.4846402Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm_hooks.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:25:34.294413] 2025-12-04T14:28:29.7019252Z 2025-12-04T14:28:29.7020725Z distributed/fsdp/test_fsdp_comm_hooks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_comm_hooks_1.1_7eac99cefc27e883_.log 2025-12-04T14:28:29.7026345Z Running 28 items in this shard: test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_False_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_False_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_False_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_True_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_True_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_True_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_behavior_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_behavior_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_behavior_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy1, 
test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_False_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_False_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_False_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_True_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_True_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_True_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_hybrid_strategy, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_non_root_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_non_root_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_non_root_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_submodules_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_submodules_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_submodules_sharding_strategy2 2025-12-04T14:28:29.7031728Z 2025-12-04T14:28:29.7031870Z Finished distributed/fsdp/test_fsdp_comm_hooks 1/1 ... [2025-12-04 14:28:29.702042][2243620.192017027], took 2.92min 2025-12-04T14:28:29.7046023Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:28:29.7061327Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:28:29.7063946Z Running distributed/_shard/test_sharder 1/1 ... [2025-12-04 14:28:29.706200][2243620.196181609] 2025-12-04T14:28:29.7064361Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:28:29.7065958Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_shard/test_sharder.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:28:29.706406] 2025-12-04T14:28:41.3413728Z 2025-12-04T14:28:41.3414899Z distributed/_shard/test_sharder 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.test_sharder_1.1_b4c974959ce73dc9_.log 2025-12-04T14:28:41.3416056Z Running 2 items in this shard: test/distributed/_shard/test_sharder.py::TestCustomSharder::test_custom_sharder, test/distributed/_shard/test_sharder.py::TestCustomSharder::test_custom_sharder_errors 2025-12-04T14:28:41.3416669Z 2025-12-04T14:28:41.3416931Z Finished distributed/_shard/test_sharder 1/1 ... 
[2025-12-04 14:28:41.341068][2243631.831044182], took 0.19min 2025-12-04T14:28:41.3435653Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:28:41.3453388Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:28:41.3454442Z Running distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 ... [2025-12-04 14:28:41.345233][2243631.835214144] 2025-12-04T14:28:41.3454813Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:28:41.3455996Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_tensor_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:28:41.345447] 2025-12-04T14:29:06.2506082Z 2025-12-04T14:29:06.2507272Z distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_tensor_ops_1.1_981b77665d19fb6d_.log 2025-12-04T14:29:06.2509094Z Running 5 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_clone, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_deep_copy, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_detach, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_inplace_copy, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_set_requires_grad 2025-12-04T14:29:06.2511150Z 2025-12-04T14:29:06.2511425Z Finished distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 ... [2025-12-04 14:29:06.250352][2243656.740328794], took 0.42min 2025-12-04T14:29:06.2529648Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:29:06.2545216Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:29:06.2546907Z Running distributed/fsdp/test_fsdp_tp_integration 1/1 ... [2025-12-04 14:29:06.254582][2243656.744563425] 2025-12-04T14:29:06.2547217Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:29:06.2549067Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_tp_integration.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:29:06.254788] 2025-12-04T14:29:40.3754230Z 2025-12-04T14:29:40.3755377Z distributed/fsdp/test_fsdp_tp_integration 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_tp_integration_1.1_6ee65dfec3291e98_.log 2025-12-04T14:29:40.3757044Z Running 3 items in this shard: test/distributed/fsdp/test_fsdp_tp_integration.py::TestTPFSDPIntegration::test_fsdp_tp_extension_grad, test/distributed/fsdp/test_fsdp_tp_integration.py::TestTPFSDPIntegration::test_fsdp_tp_integration, test/distributed/fsdp/test_fsdp_tp_integration.py::TestTPFSDPIntegration::test_fsdp_tp_sync_module_state 2025-12-04T14:29:40.3758068Z 2025-12-04T14:29:40.3758364Z Finished distributed/fsdp/test_fsdp_tp_integration 1/1 ... 
[2025-12-04 14:29:40.375063][2243690.865039437], took 0.57min 2025-12-04T14:29:40.3777401Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:29:40.3793948Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:29:40.3796546Z Running distributed/_shard/sharded_optim/test_sharded_optim 1/1 ... [2025-12-04 14:29:40.379532][2243690.869513004] 2025-12-04T14:29:40.3796875Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:29:40.3798896Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_shard/sharded_optim/test_sharded_optim.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:29:40.379734] 2025-12-04T14:29:52.5152553Z 2025-12-04T14:29:52.5154041Z distributed/_shard/sharded_optim/test_sharded_optim 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_optim.test_sharded_optim_1.1_cff2312f2f4b15f1_.log 2025-12-04T14:29:52.5156031Z Running 2 items in this shard: test/distributed/_shard/sharded_optim/test_sharded_optim.py::TestShardedOptimizer::test_named_params_with_sharded_tensor, test/distributed/_shard/sharded_optim/test_sharded_optim.py::TestShardedOptimizer::test_sharded_optim 2025-12-04T14:29:52.5157115Z 2025-12-04T14:29:52.5157573Z Finished distributed/_shard/sharded_optim/test_sharded_optim 1/1 ... [2025-12-04 14:29:52.514913][2243703.004888165], took 0.20min 2025-12-04T14:29:52.5169983Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:29:52.5187289Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:29:52.5188900Z Running distributed/_composable/fsdp/test_fully_shard_state_dict 1/1 ... [2025-12-04 14:29:52.518769][2243703.008749291] 2025-12-04T14:29:52.5189423Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:29:52.5190951Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_state_dict.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 14:29:52.518963] 2025-12-04T14:30:32.9458025Z 2025-12-04T14:30:32.9459271Z distributed/_composable/fsdp/test_fully_shard_state_dict 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_state_dict_1.1_21b049e7a8aa310c_.log 2025-12-04T14:30:32.9461231Z Running 7 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_state_dict.py::TestFullyShardStateDictMultiProcess::test_2d_state_dict_correctness, test/distributed/_composable/fsdp/test_fully_shard_state_dict.py::TestFullyShardStateDictMultiProcess::test_cached_state_dict, test/distributed/_composable/fsdp/test_fully_shard_state_dict.py::TestFullyShardStateDictMultiProcess::test_dp_state_dict_cpu_offload, test/distributed/_composable/fsdp/test_fully_shard_state_dict.py::TestFullyShardStateDictMultiProcess::test_dp_state_dict_save_load, test/distributed/_composable/fsdp/test_fully_shard_state_dict.py::TestFullyShardStateDictMultiProcess::test_dp_tp_state_dict_save_load, test/distributed/_composable/fsdp/test_fully_shard_state_dict.py::TestFullyShardStateDictMultiProcess::test_hsdp_tp_state_dict_save_load, test/distributed/_composable/fsdp/test_fully_shard_state_dict.py::TestFullyShardStateDictMultiThread::test_rank0_offload_full_state_dict 2025-12-04T14:30:32.9462642Z 2025-12-04T14:30:32.9462805Z Finished distributed/_composable/fsdp/test_fully_shard_state_dict 1/1 ... [2025-12-04 14:30:32.945471][2243743.435446428], took 0.67min 2025-12-04T14:30:32.9482491Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:30:32.9497952Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:30:32.9502962Z Running distributed/test_c10d_pypg 1/1 ... [2025-12-04 14:30:32.949931][2243743.439911925] 2025-12-04T14:30:32.9503178Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:30:32.9503584Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_pypg.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 14:30:32.950137] 2025-12-04T14:30:40.0751460Z 2025-12-04T14:30:40.0752546Z distributed/test_c10d_pypg 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_pypg_1.1_80f79431cd0541aa_.log 2025-12-04T14:30:40.0767936Z Running 48 items in this shard: test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_dataclass_output, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_dataclass_output_unused_param, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_dynamic_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_once_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_static_graph_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_static_graph_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_unused_params_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_unused_params_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_weight_sharing_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_weight_sharing_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_invoke_work_object, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_no_init_sync, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_with_pypg, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_with_pypg_with_grad_views, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_invalid_powerSGD_state, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_sync_batch_norm_empty_input, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_sync_batch_norm_only_empty_input, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_dataclass_output, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_dataclass_output_unused_param, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_dynamic_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_once_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_static_graph_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_static_graph_use_reentrant_True, 
test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_unused_params_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_unused_params_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_weight_sharing_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_weight_sharing_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_invoke_work_object, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_no_init_sync, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_with_pypg, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_with_pypg_with_grad_views, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_invalid_powerSGD_state, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_sync_batch_norm_empty_input, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_sync_batch_norm_only_empty_input, test/distributed/test_c10d_pypg.py::TestPyProcessGroup::test_abort_shutdown, test/distributed/test_c10d_pypg.py::TestPyProcessGroup::test_attr_overrides, test/distributed/test_c10d_pypg.py::TestPyProcessGroup::test_block_current_stream, test/distributed/test_c10d_pypg.py::TestPyProcessGroup::test_block_current_stream_use_after_free 2025-12-04T14:30:40.0777687Z 2025-12-04T14:30:40.0777888Z Finished distributed/test_c10d_pypg 1/1 ... [2025-12-04 14:30:40.074836][2243750.564811366], took 0.12min 2025-12-04T14:30:40.0778405Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:30:40.0794768Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:30:40.0796029Z Running distributed/test_pg_wrapper 1/1 ... [2025-12-04 14:30:40.079426][2243750.569406672] 2025-12-04T14:30:40.0796234Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:30:40.0798034Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_pg_wrapper.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 14:30:40.079631] 2025-12-04T14:32:18.3763944Z 2025-12-04T14:32:18.3764967Z distributed/test_pg_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_pg_wrapper_1.1_29959350be7bbbae_.log 2025-12-04T14:32:18.3773042Z Running 17 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_coalescing_manager_debug_mode_detail, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_hang, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_shape_mismatch_debug_mode_detail, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_shape_mismatch_debug_mode_off, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collectives_op_mismatch, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collectives_op_mismatch_debug_mode, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_debug_level_detail_no_gloo, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_new_group_no_gloo, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_hang, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_cuda, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_cuda_debug_mode, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_debug_mode, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_debug_mode_off, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_cuda, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_cuda_debug_mode, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_debug_mode 2025-12-04T14:32:18.3778506Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_coalescing_manager_debug_mode_detail 2025-12-04T14:32:18.3779190Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_hang 2025-12-04T14:32:18.3779878Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_shape_mismatch_debug_mode_detail 2025-12-04T14:32:18.3780679Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_shape_mismatch_debug_mode_off 2025-12-04T14:32:18.3781403Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collectives_op_mismatch 2025-12-04T14:32:18.3782085Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collectives_op_mismatch_debug_mode 2025-12-04T14:32:18.3782632Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_debug_level_detail_no_gloo 2025-12-04T14:32:18.3783219Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_new_group_no_gloo 2025-12-04T14:32:18.3783788Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_hang 2025-12-04T14:32:18.3784263Z Running 1 
items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_cuda 2025-12-04T14:32:18.3784787Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_cuda_debug_mode 2025-12-04T14:32:18.3785331Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_debug_mode 2025-12-04T14:32:18.3785945Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_debug_mode_off 2025-12-04T14:32:18.3786452Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch 2025-12-04T14:32:18.3786934Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_cuda 2025-12-04T14:32:18.3787483Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_cuda_debug_mode 2025-12-04T14:32:18.3788006Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_debug_mode 2025-12-04T14:32:18.3788300Z 2025-12-04T14:32:18.3788462Z Finished distributed/test_pg_wrapper 1/1 ... [2025-12-04 14:32:18.376444][2243848.866420361], took 1.64min 2025-12-04T14:32:18.3790346Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:32:18.3806555Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:32:18.3808939Z Running distributed/_shard/sharded_tensor/ops/test_binary_cmp 1/1 ... [2025-12-04 14:32:18.380791][2243848.87077244] 2025-12-04T14:32:18.3809182Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:32:18.3811062Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_binary_cmp.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:32:18.380986] 2025-12-04T14:32:39.2278189Z 2025-12-04T14:32:39.2282911Z distributed/_shard/sharded_tensor/ops/test_binary_cmp 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_binary_cmp_1.1_932e5a4ccf1aa3d8_.log 2025-12-04T14:32:39.2284753Z Running 4 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_binary_cmp.py::TestShardedTensorBinaryOps::test_torch_allclose, test/distributed/_shard/sharded_tensor/ops/test_binary_cmp.py::TestShardedTensorBinaryOps::test_torch_allclose_tensor_specs, test/distributed/_shard/sharded_tensor/ops/test_binary_cmp.py::TestShardedTensorBinaryOps::test_torch_equal, test/distributed/_shard/sharded_tensor/ops/test_binary_cmp.py::TestShardedTensorBinaryOps::test_torch_equal_tensor_specs 2025-12-04T14:32:39.2286018Z 2025-12-04T14:32:39.2286304Z Finished distributed/_shard/sharded_tensor/ops/test_binary_cmp 1/1 ... 
[2025-12-04 14:32:39.227616][2243869.717591145], took 0.35min 2025-12-04T14:32:39.2303406Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:32:39.2318724Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:32:39.2320724Z Running distributed/nn/jit/test_instantiator 1/1 ... [2025-12-04 14:32:39.231921][2243869.721902325] 2025-12-04T14:32:39.2321489Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:32:39.2322532Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/nn/jit/test_instantiator.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:32:39.232130] 2025-12-04T14:32:41.4003000Z 2025-12-04T14:32:41.4003535Z distributed/nn/jit/test_instantiator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.nn.jit.test_instantiator_1.1_ba4d62ce1f872002_.log 2025-12-04T14:32:41.4005155Z Running 3 items in this shard: test/distributed/nn/jit/test_instantiator.py::TestInstantiator::test_get_arg_return_types_from_interface, test/distributed/nn/jit/test_instantiator.py::TestInstantiator::test_instantiate_non_scripted_remote_module_template, test/distributed/nn/jit/test_instantiator.py::TestInstantiator::test_instantiate_scripted_remote_module_template 2025-12-04T14:32:41.4006069Z 2025-12-04T14:32:41.4006297Z Finished distributed/nn/jit/test_instantiator 1/1 ... [2025-12-04 14:32:41.400058][2243871.89003206], took 0.04min 2025-12-04T14:32:41.4029367Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:32:41.4043953Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:32:41.4046310Z Running distributed/_shard/sharding_spec/test_sharding_spec 1/1 ... [2025-12-04 14:32:41.404444][2243871.894424868] 2025-12-04T14:32:41.4046616Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:32:41.4047984Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_shard/sharding_spec/test_sharding_spec.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 14:32:41.404649] 2025-12-04T14:32:55.6956341Z 2025-12-04T14:32:55.6957476Z distributed/_shard/sharding_spec/test_sharding_spec 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharding_spec.test_sharding_spec_1.1_f628a53de31b1782_.log 2025-12-04T14:32:55.6961696Z Running 11 items in this shard: test/distributed/_shard/sharding_spec/test_sharding_spec.py::TestShardingSpec::test_check_overlapping, test/distributed/_shard/sharding_spec/test_sharding_spec.py::TestShardingSpec::test_chunked_sharding_spec, test/distributed/_shard/sharding_spec/test_sharding_spec.py::TestShardingSpec::test_device_placement, test/distributed/_shard/sharding_spec/test_sharding_spec.py::TestShardingSpec::test_enumerable_sharding_spec, test/distributed/_shard/sharding_spec/test_sharding_spec.py::TestShardingSpec::test_get_chunk_sharding_params, test/distributed/_shard/sharding_spec/test_sharding_spec.py::TestShardingSpec::test_get_chunked_dim_size, test/distributed/_shard/sharding_spec/test_sharding_spec.py::TestShardingSpec::test_get_split_size, test/distributed/_shard/sharding_spec/test_sharding_spec.py::TestShardingSpec::test_infer_sharding_spec_from_shards_metadata, test/distributed/_shard/sharding_spec/test_sharding_spec.py::TestCustomShardingSpec::test_custom_sharding_spec, test/distributed/_shard/sharding_spec/test_sharding_spec.py::TestCustomShardingSpec::test_custom_sharding_spec_shard_tensor, test/distributed/_shard/sharding_spec/test_sharding_spec.py::TestCustomShardingSpec::test_custom_sharding_spec_tensor_ctor 2025-12-04T14:32:55.6965087Z 2025-12-04T14:32:55.6965380Z Finished distributed/_shard/sharding_spec/test_sharding_spec 1/1 ... [2025-12-04 14:32:55.695274][2243886.18525029], took 0.24min 2025-12-04T14:32:55.6980301Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:32:55.6997301Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:32:55.6998309Z Running distributed/test_nccl 1/1 ... [2025-12-04 14:32:55.699623][2243886.189603838] 2025-12-04T14:32:55.6998624Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:32:55.7000135Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_nccl.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 14:32:55.699825] 2025-12-04T14:33:05.9810279Z 2025-12-04T14:33:05.9811281Z distributed/test_nccl 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_nccl_1.1_16d17b09b63152fd_.log 2025-12-04T14:33:05.9817120Z Running 15 items in this shard: test/distributed/test_nccl.py::NCCLSymmetricMemoryTest::test_nccl_symmem_alloc, test/distributed/test_nccl.py::TestNCCLCUDA::test_all_gather_cuda_bfloat16, test/distributed/test_nccl.py::TestNCCLCUDA::test_all_gather_cuda_float32, test/distributed/test_nccl.py::TestNCCLCUDA::test_all_reduce_cuda_bfloat16, test/distributed/test_nccl.py::TestNCCLCUDA::test_all_reduce_cuda_float32, test/distributed/test_nccl.py::TestNCCLCUDA::test_broadcast_cuda_bfloat16, test/distributed/test_nccl.py::TestNCCLCUDA::test_broadcast_cuda_float32, test/distributed/test_nccl.py::TestNCCLCUDA::test_broadcast_cuda_float8_e4m3fnuz, test/distributed/test_nccl.py::TestNCCLCUDA::test_broadcast_cuda_float8_e5m2fnuz, test/distributed/test_nccl.py::TestNCCLCUDA::test_collective_errors_cuda, test/distributed/test_nccl.py::TestNCCLCUDA::test_reduce_cuda_bfloat16, test/distributed/test_nccl.py::TestNCCLCUDA::test_reduce_cuda_float32, test/distributed/test_nccl.py::TestNCCLCUDA::test_reduce_scatter_cuda_bfloat16, test/distributed/test_nccl.py::TestNCCLCUDA::test_reduce_scatter_cuda_float32, test/distributed/test_nccl.py::TestNCCLCUDA::test_unique_id_cuda 2025-12-04T14:33:05.9820781Z 2025-12-04T14:33:05.9820895Z Finished distributed/test_nccl 1/1 ... [2025-12-04 14:33:05.980657][2243896.47063464], took 0.17min 2025-12-04T14:33:05.9829713Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:33:05.9843579Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:33:05.9846406Z Running distributed/fsdp/test_fsdp_misc 1/1 ... [2025-12-04 14:33:05.984434][2243896.474415596] 2025-12-04T14:33:05.9846828Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:33:05.9847706Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_misc.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 14:33:05.984634] 2025-12-04T14:34:28.2329556Z 2025-12-04T14:34:28.2330570Z distributed/fsdp/test_fsdp_misc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_misc_1.1_935d50eab34d891a_.log 2025-12-04T14:34:28.2339989Z Running 28 items in this shard: test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_cpu_init_with_sync_module_states, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_cpu_init_stays_on_cpu, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_cpu_training, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_device_id_use_index_False, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_device_id_use_index_True, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_module_no_compute_grad_use_second_layer_False_sharding_strategy0, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_module_no_compute_grad_use_second_layer_False_sharding_strategy1, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_module_no_compute_grad_use_second_layer_True_sharding_strategy0, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_module_no_compute_grad_use_second_layer_True_sharding_strategy1, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_not_all_outputs_used_in_loss, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_optim_overlap_no_use_orig_params_error, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_optimizer_overlap, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_zero2_eval_with_prefetch, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_cpu_gpu_module, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_device_id_auto_wrap, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_fsdp_device_id_cpu_offload, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_fsdp_device_id_no_move_ignored_params_and_bufs, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_fsdp_ignored_module_meta, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_fsdp_namedtuple, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_fsdp_same_model_across_ranks, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_fsdp_unsupported_module_cls, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_homogeneous_attributes, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_module_device_mismatches_device_id, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_multigpu_module, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_no_params, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscWorldSize1::test_training_device_mismatch_errors, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscWorldSize1::test_unsafe_setattr, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscWorldSize1::test_world_size_1_sharding_strategy_warning 2025-12-04T14:34:28.2347417Z 2025-12-04T14:34:28.2347641Z Finished distributed/fsdp/test_fsdp_misc 1/1 ... 
[2025-12-04 14:34:28.232658][2243978.722635995], took 1.37min 2025-12-04T14:34:28.2348806Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:34:28.2364355Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:34:28.2366540Z Running distributed/fsdp/test_fsdp_meta 1/1 ... [2025-12-04 14:34:28.236513][2243978.726493321] 2025-12-04T14:34:28.2366765Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:34:28.2368498Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_meta.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:34:28.236743] 2025-12-04T14:35:24.4932267Z 2025-12-04T14:35:24.4937150Z distributed/fsdp/test_fsdp_meta 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_meta_1.1_426aeec0757fbe63_.log 2025-12-04T14:35:24.4944656Z Running 15 items in this shard: test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_bad_arg_meta, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_bad_arg_torchdistx, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_meta_device_with_mixed_precision, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_nested_model_with_meta_device_default_init_auto_wrap_False, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_nested_model_with_meta_device_default_init_auto_wrap_True, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_nested_model_with_meta_device_reset_params_auto_wrap_False, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_nested_model_with_meta_device_reset_params_auto_wrap_True, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_nested_model_with_torchdistX_default_init_auto_wrap_False, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_nested_model_with_torchdistX_default_init_auto_wrap_True, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_nested_model_with_torchdistX_init_fn_auto_wrap_False, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_nested_model_with_torchdistX_init_fn_auto_wrap_True, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_simple_model_with_meta_device_default_init, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_simple_model_with_meta_device_reset_params, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_simple_model_with_torchdistX_default_init, test/distributed/fsdp/test_fsdp_meta.py::TestFSDPWithMetaDevice::test_simple_model_with_torchdistX_init_fn 2025-12-04T14:35:24.4949626Z 2025-12-04T14:35:24.4949859Z Finished distributed/fsdp/test_fsdp_meta 1/1 ... [2025-12-04 14:35:24.492901][2244034.982876942], took 0.94min 2025-12-04T14:35:24.4960040Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:35:24.4974756Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:35:24.4976772Z Running distributed/fsdp/test_fsdp_unshard_params 1/1 ... 
[2025-12-04 14:35:24.497538][2244034.987519337] 2025-12-04T14:35:24.4977043Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:35:24.4979210Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_unshard_params.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:35:24.497731] 2025-12-04T14:36:30.1762769Z 2025-12-04T14:36:30.1763780Z distributed/fsdp/test_fsdp_unshard_params 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_unshard_params_1.1_ebe89ada6b693b93_.log 2025-12-04T14:36:30.1769951Z Running 15 items in this shard: test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_named_parameters_and_buffers, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_params_param_data, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_params_recurse, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_params_respects_reshard, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_params_writeback, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_singleton_param_writeback, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_submodule, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_with_grads_core, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_with_grads_none_grads, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsNoShard::test_unshard_params_param_data_no_shard, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsNoShard::test_unshard_params_writeback_no_shard, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsErrors::test_offload_to_cpu_no_shard_raises, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsErrors::test_rank0_only_with_writeback_raises, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsErrors::test_unshard_params_from_backward_raises, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsErrors::test_unshard_params_from_forward_raises 2025-12-04T14:36:30.1774683Z 2025-12-04T14:36:30.1774942Z Finished distributed/fsdp/test_fsdp_unshard_params 1/1 ... [2025-12-04 14:36:30.176004][2244100.665980014], took 1.09min 2025-12-04T14:36:30.1788418Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:36:30.1804965Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:36:30.1805666Z Running distributed/checkpoint/test_state_dict_utils 1/1 ... [2025-12-04 14:36:30.180452][2244100.670433731] 2025-12-04T14:36:30.1805966Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:36:30.1808704Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_state_dict_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 14:36:30.180650] 2025-12-04T14:37:04.8003208Z 2025-12-04T14:37:04.8004318Z distributed/checkpoint/test_state_dict_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_state_dict_utils_1.1_ab1a6ac55e1ad4c9_.log 2025-12-04T14:37:04.8008627Z Running 7 items in this shard: test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_complicated_dict, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_cpu_and_ranks_only, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_cpu_offload_for_dtensor, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_create_cpu_state_dict, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_gather_state_dict_dtensor, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_gather_with_cpu_and_ranks_only, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_state_dict_util_distribute_tensors 2025-12-04T14:37:04.8011120Z 2025-12-04T14:37:04.8011317Z Finished distributed/checkpoint/test_state_dict_utils 1/1 ... [2025-12-04 14:37:04.800118][2244135.290094283], took 0.58min 2025-12-04T14:37:04.8027894Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:37:04.8043192Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:37:04.8047467Z Running distributed/_shard/sharded_tensor/ops/test_init 1/1 ... [2025-12-04 14:37:04.804464][2244135.294445382] 2025-12-04T14:37:04.8047718Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:37:04.8048202Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_init.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:37:04.804666] 2025-12-04T14:37:21.2442532Z 2025-12-04T14:37:21.2443678Z distributed/_shard/sharded_tensor/ops/test_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_init_1.1_fd318445c49edc59_.log 2025-12-04T14:37:21.2445876Z Running 3 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_init.py::TestShardedTensorNNInit::test_init_sharded_tensor_with_kaiming_uniform, test/distributed/_shard/sharded_tensor/ops/test_init.py::TestShardedTensorNNInit::test_init_sharded_tensor_with_normal, test/distributed/_shard/sharded_tensor/ops/test_init.py::TestShardedTensorNNInit::test_init_sharded_tensor_with_uniform 2025-12-04T14:37:21.2447300Z 2025-12-04T14:37:21.2447682Z Finished distributed/_shard/sharded_tensor/ops/test_init 1/1 ... [2025-12-04 14:37:21.243989][2244151.733965144], took 0.27min 2025-12-04T14:37:21.2466059Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:37:21.2483145Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:37:21.2483730Z Running distributed/_shard/sharded_tensor/ops/test_embedding 1/1 ... 
[2025-12-04 14:37:21.248235][2244151.738216644] 2025-12-04T14:37:21.2484121Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:37:21.2486074Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_embedding.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:37:21.248439] 2025-12-04T14:37:34.0851672Z 2025-12-04T14:37:34.0853241Z distributed/_shard/sharded_tensor/ops/test_embedding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_embedding_1.1_b03bf196836f15d3_.log 2025-12-04T14:37:34.0854754Z Running 2 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_embedding.py::TestShardedEmbedding::test_sharded_embedding_colwise, test/distributed/_shard/sharded_tensor/ops/test_embedding.py::TestShardedEmbedding::test_sharded_embedding_rowwise 2025-12-04T14:37:34.0855632Z 2025-12-04T14:37:34.0856164Z Finished distributed/_shard/sharded_tensor/ops/test_embedding 1/1 ... [2025-12-04 14:37:34.084818][2244164.574794525], took 0.21min 2025-12-04T14:37:34.0877662Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:37:34.0892844Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:37:34.0895090Z Running distributed/_shard/sharded_tensor/ops/test_embedding_bag 1/1 ... [2025-12-04 14:37:34.089330][2244164.579311611] 2025-12-04T14:37:34.0895488Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:37:34.0896502Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_embedding_bag.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:37:34.089537] 2025-12-04T14:37:47.1769565Z 2025-12-04T14:37:47.1770989Z distributed/_shard/sharded_tensor/ops/test_embedding_bag 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_embedding_bag_1.1_5f70f166393bd1ef_.log 2025-12-04T14:37:47.1772806Z Running 2 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_embedding_bag.py::TestShardedEmbeddingBag::test_sharded_embedding_bag_colwise, test/distributed/_shard/sharded_tensor/ops/test_embedding_bag.py::TestShardedEmbeddingBag::test_sharded_embedding_bag_rowwise 2025-12-04T14:37:47.1774063Z 2025-12-04T14:37:47.1774797Z Finished distributed/_shard/sharded_tensor/ops/test_embedding_bag 1/1 ... [2025-12-04 14:37:47.176662][2244177.666639483], took 0.22min 2025-12-04T14:37:47.1790034Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:37:47.1806617Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:37:47.1807188Z Running distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 ... 
[2025-12-04 14:37:47.180541][2244177.670522859] 2025-12-04T14:37:47.1807651Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:37:47.1809036Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:37:47.180745] 2025-12-04T14:37:58.9163465Z 2025-12-04T14:37:58.9164806Z distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.test_sharded_tensor_reshard_1.1_3bb6da4b93497946_.log 2025-12-04T14:37:58.9167621Z Running 2 items in this shard: test/distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py::TestReshard::test_sharded_tensor_reshard, test/distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py::TestReshard::test_sharded_tensor_reshard_errors 2025-12-04T14:37:58.9168549Z 2025-12-04T14:37:58.9168871Z Finished distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 ... [2025-12-04 14:37:58.916064][2244189.406038983], took 0.20min 2025-12-04T14:37:58.9191890Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T14:37:58.9208922Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T14:37:58.9209318Z Running distributed/fsdp/test_fsdp_core 2/2 ... [2025-12-04 14:37:58.920753][2244189.410734427] 2025-12-04T14:37:58.9209643Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T14:37:58.9213324Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_core.py', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 14:37:58.920966] 2025-12-04T15:04:54.6232447Z 2025-12-04T15:04:54.6233559Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_core 2/2 (test/test-reports/distributed.fsdp.test_fsdp_core_2.2_b1f81712a7b176a7_.log) 2025-12-04T15:04:54.6234763Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-22f758e0911cec15.xml 2025-12-04T15:04:54.6235761Z ============================= test session starts ============================== 2025-12-04T15:04:54.6236295Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.6236709Z cachedir: .pytest_cache 2025-12-04T15:04:54.6237150Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.6237685Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.6237993Z configfile: pytest.ini 2025-12-04T15:04:54.6238527Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.6239088Z collecting ... 
collected 60 items 2025-12-04T15:04:54.6239379Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T15:04:54.6248201Z Running 27 items in this shard: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda, test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_False_cuda, test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_True_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_False_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_True_cuda, test/distributed/fsdp/test_fsdp_core.py::TestAutogradCUDA::test_unshard_params_as_tensors_cuda 2025-12-04T15:04:54.6254390Z 2025-12-04T15:04:54.6254808Z 
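The `Executing [...]` records above print the exact argv the test runner uses for each shard. As a minimal local-repro sketch (assumptions not in the log: the command is launched from the `test/` directory of a pytorch checkout, and a plain `python` on PATH stands in for the CI interpreter at /opt/conda/envs/py_3.10/bin/python), the same shard invocation can be reconstructed like this:

import subprocess

# argv copied from the "Executing [...]" record for distributed/fsdp/test_fsdp_core 2/2
argv = [
    "python", "-bb",                       # -bb: raise errors on bytes/str comparisons
    "distributed/fsdp/test_fsdp_core.py",
    "--shard-id=2", "--num-shards=2",      # run shard 2 of 2, as in the log
    "-v", "-vv", "-rfEX",                  # verbosity and result-summary flags from the log
    "-p", "no:xdist",                      # disable the pytest-xdist plugin
    "--use-pytest", "-x", "--reruns=0",
    "--import-slow-tests", "--import-disabled-tests",
]
result = subprocess.run(argv, cwd="test")  # assumption: invoked from the pytorch checkout root
print("exit code:", result.returncode)     # the run below ends with "Got exit code 1"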
distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda I1204 14:38:00.695000 413688 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 413757 2025-12-04T15:04:54.6255493Z I1204 14:38:00.696000 413688 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 413758 2025-12-04T15:04:54.6255953Z I1204 14:38:00.696000 413688 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 413759 2025-12-04T15:04:54.6256439Z I1204 14:38:00.697000 413688 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 413760 2025-12-04T15:04:54.6257125Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6257625Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6258082Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6258536Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6259145Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6259785Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6260300Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6260812Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6261410Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6262024Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6262519Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6262976Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6263610Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6264254Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6264862Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6265468Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6267089Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.6268578Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.6270008Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.6271535Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.6273011Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.6274426Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.6275842Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.6277263Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.6277568Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6277916Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6278409Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6278897Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6279378Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6279879Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6280399Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6280899Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6281380Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6281873Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6282335Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6282827Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6283286Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6283803Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6284472Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3755999232. 2025-12-04T15:04:54.6285102Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6285483Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6286086Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6286611Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6286977Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6287390Z [rank0]:E1204 14:38:08.060000 413757 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.6287634Z dist init r=0, world=4 2025-12-04T15:04:54.6287842Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6288351Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6288837Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6289321Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6289796Z [rank3]:E1204 14:38:08.061000 413760 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6290282Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6290768Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6291257Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6291720Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6292205Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6292681Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6293141Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6293617Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6294126Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6294795Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488. 
2025-12-04T15:04:54.6295430Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6295783Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6296376Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6296947Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6297345Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6326843Z [rank3]:E1204 14:38:08.061000 413760 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.6327110Z dist init r=3, world=4 2025-12-04T15:04:54.6327351Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6327757Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6328361Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6328908Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6329460Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6329939Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6330425Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6330892Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6331387Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6331940Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6332439Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6332900Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.6333359Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6333911Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6334669Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3619684352. 2025-12-04T15:04:54.6335323Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6335702Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6336341Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6336896Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6337314Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6337808Z [rank1]:E1204 14:38:08.102000 413758 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.6338068Z dist init r=1, world=4 2025-12-04T15:04:54.6338318Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6338738Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6339314Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6339905Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6340429Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6340874Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6341348Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6341826Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.6342337Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6342810Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6343278Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6343744Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6344209Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6344687Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6345357Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3602907136. 2025-12-04T15:04:54.6345992Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6346350Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6346955Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6347467Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6347840Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6348260Z [rank2]:E1204 14:38:08.106000 413759 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.6348512Z dist init r=2, world=4 2025-12-04T15:04:54.6348937Z [rank0]:[W1204 14:38:08.739841823 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T15:04:54.6349426Z FAILED [9.2208s] [ 3%]
2025-12-04T15:04:54.6349492Z 
2025-12-04T15:04:54.6349566Z =================================== FAILURES ===================================
2025-12-04T15:04:54.6349778Z ___ TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda ____
2025-12-04T15:04:54.6349960Z Traceback (most recent call last):
2025-12-04T15:04:54.6350269Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.6350519Z     self._join_processes(fn)
2025-12-04T15:04:54.6350805Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.6351072Z     self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.6351346Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.6351608Z     raise RuntimeError(error)
2025-12-04T15:04:54.6351761Z RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T15:04:54.6351945Z Traceback (most recent call last):
2025-12-04T15:04:54.6352186Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.6352427Z     getattr(self, test_name)()
2025-12-04T15:04:54.6352662Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.6352895Z     fn()
2025-12-04T15:04:54.6353100Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.6353334Z     method(*args, **kwargs)
2025-12-04T15:04:54.6353551Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.6353781Z     method(*args, **kwargs)
2025-12-04T15:04:54.6354012Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.6354242Z     with policy():
2025-12-04T15:04:54.6354454Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.6354684Z     raise RuntimeError(msg)
2025-12-04T15:04:54.6355100Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488.
2025-12-04T15:04:54.6355490Z 
2025-12-04T15:04:54.6355568Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.6355908Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda
2025-12-04T15:04:54.6356173Z 
2025-12-04T15:04:54.6356266Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.6356394Z 
2025-12-04T15:04:54.6356396Z 
2025-12-04T15:04:54.6356477Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T15:04:54.6356683Z Process 3 terminated with exit code 10, terminating remaining processes.
2025-12-04T15:04:54.6357040Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-22f758e0911cec15.xml -
2025-12-04T15:04:54.6357379Z =========================== short test summary info ============================
2025-12-04T15:04:54.6357732Z FAILED [9.2208s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda - RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T15:04:54.6358107Z Traceback (most recent call last):
2025-12-04T15:04:54.6358352Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.6358595Z     getattr(self, test_name)()
2025-12-04T15:04:54.6358829Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.6359067Z     fn()
2025-12-04T15:04:54.6359268Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.6359504Z     method(*args, **kwargs)
2025-12-04T15:04:54.6359747Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.6359979Z     method(*args, **kwargs)
2025-12-04T15:04:54.6360257Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.6360493Z     with policy():
2025-12-04T15:04:54.6360742Z   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.6360977Z     raise RuntimeError(msg)
2025-12-04T15:04:54.6361397Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488.
2025-12-04T15:04:54.6361776Z 
2025-12-04T15:04:54.6361857Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.6362198Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda
2025-12-04T15:04:54.6362465Z 
2025-12-04T15:04:54.6362556Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.6362748Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
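The failure above is the memory-leak checker tripping: it records the caching allocator's allocated bytes before the test body and compares them afterwards (512 vs. 19456 here). A minimal sketch of that before/after accounting, using only public torch.cuda counters and not the actual checker in torch/testing/_internal/common_utils.py, looks like:

import gc
import torch

def assert_no_caching_allocator_growth(fn, device: int = 0) -> None:
    torch.cuda.synchronize(device)
    before = torch.cuda.memory_allocated(device)   # caching-allocator bytes currently in use
    fn()                                           # run one test body
    gc.collect()                                   # drop lingering Python references first
    torch.cuda.synchronize(device)
    after = torch.cuda.memory_allocated(device)
    if after != before:                            # e.g. 512 -> 19456 in the log above
        raise RuntimeError(
            f"possible CUDA memory leak on device {device}: "
            f"allocated memory was {before} and is now {after}"
        )

The real checker additionally compares driver-level allocation (the second pair of numbers in the message), which a sketch like this could approximate from torch.cuda.mem_get_info().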
2025-12-04T15:04:54.6362911Z ============================== 1 failed in 9.38s =============================== 2025-12-04T15:04:54.6363048Z Got exit code 1 2025-12-04T15:04:54.6363148Z Retrying single test... 2025-12-04T15:04:54.6363402Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a9d771ba7b7f17d8.xml 2025-12-04T15:04:54.6363685Z ============================= test session starts ============================== 2025-12-04T15:04:54.6363900Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.6364091Z cachedir: .pytest_cache 2025-12-04T15:04:54.6364316Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.6364559Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.6364678Z configfile: pytest.ini 2025-12-04T15:04:54.6364907Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.6365179Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.6365509Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6365805Z Running 1 items in this shard 2025-12-04T15:04:54.6365877Z 2025-12-04T15:04:54.6366183Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda I1204 14:38:12.650000 414090 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 414159 2025-12-04T15:04:54.6366710Z I1204 14:38:12.651000 414090 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 414160 2025-12-04T15:04:54.6367091Z I1204 14:38:12.651000 414090 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 414161 2025-12-04T15:04:54.6367432Z I1204 14:38:12.652000 414090 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 414162 2025-12-04T15:04:54.6367984Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6368423Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6369178Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.6369790Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6370297Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6370734Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6371305Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6371885Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6372335Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6372766Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6373335Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6373921Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6374366Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6374797Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6375359Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6375938Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6377327Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. 
If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.6378784Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.6380281Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.6381687Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.6383118Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.6384532Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.6385949Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. 
If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.6387409Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.6387717Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6388060Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6388566Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6389051Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6389541Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6389992Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6390495Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6390962Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6391430Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6391908Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6392372Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6392832Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6393293Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6393769Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6394442Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3619684352. 
2025-12-04T15:04:54.6395061Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6395413Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6396008Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6396564Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6396938Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6397357Z [rank1]:E1204 14:38:19.851000 414160 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.6397608Z dist init r=1, world=4 2025-12-04T15:04:54.6397841Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6398187Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6398687Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6399188Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6399673Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6400132Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6400697Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6401173Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6401645Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6402116Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6402589Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6403057Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.6403520Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6404016Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6404690Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3602907136. 2025-12-04T15:04:54.6405314Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6405692Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6406295Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6406813Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6407199Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6407622Z [rank2]:E1204 14:38:19.912000 414161 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.6407875Z dist init r=2, world=4 2025-12-04T15:04:54.6408091Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6408461Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6408960Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6409447Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6409936Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6410431Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6410882Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6411355Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.6411831Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6412304Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6412778Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6413242Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6413711Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6414191Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6414861Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488. 2025-12-04T15:04:54.6415524Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6415882Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6416481Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6417018Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6417396Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6417819Z [rank3]:E1204 14:38:19.927000 414162 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.6418106Z dist init r=3, world=4 2025-12-04T15:04:54.6418321Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6418673Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6419170Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6419664Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T15:04:54.6420141Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6420630Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6421068Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6421541Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6422005Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6422467Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6422932Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6423384Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6423846Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6424314Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6425025Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3755999232. 
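[Editor's note] The RuntimeError above compares two numbers before and after the test body: the CUDA caching allocator's allocated bytes and the driver-level allocation on the device. A minimal sketch of that kind of bookkeeping using public torch.cuda APIs is below; this is an illustration of the idea, not the harness's actual leak-check implementation, and the helper names are made up. Note that the driver-side figure from mem_get_info is device-wide, so other processes can move it too.

    import torch

    def snapshot(device: int):
        # Bytes currently held by live tensors via the caching allocator.
        alloc = torch.cuda.memory_allocated(device)
        # Driver-level view: total minus free approximates what has been
        # claimed on the GPU (includes other processes on the device).
        free, total = torch.cuda.mem_get_info(device)
        return alloc, total - free

    def check_for_leak(fn, device: int = 0):
        torch.cuda.synchronize(device)
        before_alloc, before_driver = snapshot(device)
        fn()
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()  # return unused cached blocks to the driver
        after_alloc, after_driver = snapshot(device)
        if after_alloc > before_alloc:
            raise RuntimeError(
                f"allocator leak: {before_alloc} -> {after_alloc} bytes "
                f"(driver: {before_driver} -> {after_driver})"
            )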
2025-12-04T15:04:54.6425643Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6425993Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6426597Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6427101Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6427486Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6427904Z [rank0]:E1204 14:38:19.931000 414159 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.6428147Z dist init r=0, world=4 2025-12-04T15:04:54.6428548Z [rank0]:[W1204 14:38:20.805797327 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.6428956Z FAILED [9.0205s] [100%] 2025-12-04T15:04:54.6429025Z 2025-12-04T15:04:54.6429084Z =================================== FAILURES =================================== 2025-12-04T15:04:54.6429280Z ___ TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda ____ 2025-12-04T15:04:54.6429465Z Traceback (most recent call last): 2025-12-04T15:04:54.6429715Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.6429964Z self._join_processes(fn) 2025-12-04T15:04:54.6430262Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.6430528Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.6430797Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.6431059Z raise RuntimeError(error) 2025-12-04T15:04:54.6431213Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.6431379Z Traceback (most recent call last): 2025-12-04T15:04:54.6431620Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6431861Z getattr(self, test_name)() 2025-12-04T15:04:54.6432092Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6432325Z fn() 2025-12-04T15:04:54.6432528Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6432760Z method(*args, **kwargs) 2025-12-04T15:04:54.6432984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6433216Z method(*args, **kwargs) 2025-12-04T15:04:54.6433460Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6433710Z with policy(): 2025-12-04T15:04:54.6433925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6434158Z raise RuntimeError(msg) 2025-12-04T15:04:54.6434572Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3619684352. 2025-12-04T15:04:54.6434953Z 2025-12-04T15:04:54.6435028Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6435386Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6435654Z 2025-12-04T15:04:54.6435742Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6435870Z 2025-12-04T15:04:54.6435871Z 2025-12-04T15:04:54.6435950Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.6436176Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.6436535Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a9d771ba7b7f17d8.xml - 2025-12-04T15:04:54.6436864Z =========================== short test summary info ============================ 2025-12-04T15:04:54.6437211Z FAILED [9.0205s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.6437539Z Traceback (most recent call last): 2025-12-04T15:04:54.6437790Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6438037Z getattr(self, test_name)() 2025-12-04T15:04:54.6438269Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6438503Z fn() 2025-12-04T15:04:54.6438706Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6438942Z method(*args, **kwargs) 2025-12-04T15:04:54.6439162Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6439393Z method(*args, **kwargs) 2025-12-04T15:04:54.6439611Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6439839Z with policy(): 2025-12-04T15:04:54.6440053Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6440342Z raise RuntimeError(msg) 2025-12-04T15:04:54.6440759Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3619684352. 
2025-12-04T15:04:54.6441141Z 2025-12-04T15:04:54.6441217Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6441556Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6441820Z 2025-12-04T15:04:54.6441907Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6442126Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.6442314Z ======================= 1 failed, 26 deselected in 9.18s ======================= 2025-12-04T15:04:54.6442454Z Got exit code 1 2025-12-04T15:04:54.6442559Z Retrying single test... 2025-12-04T15:04:54.6442815Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9be31e2ef266a3ef.xml 2025-12-04T15:04:54.6443099Z ============================= test session starts ============================== 2025-12-04T15:04:54.6443312Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.6443503Z cachedir: .pytest_cache 2025-12-04T15:04:54.6443743Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.6443986Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.6444110Z configfile: pytest.ini 2025-12-04T15:04:54.6444341Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.6444619Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.6444966Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6445263Z Running 1 items in this shard 2025-12-04T15:04:54.6445339Z 2025-12-04T15:04:54.6445644Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda I1204 14:38:24.574000 414492 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 414561 2025-12-04T15:04:54.6446144Z I1204 14:38:24.575000 414492 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 414562 2025-12-04T15:04:54.6446490Z I1204 14:38:24.575000 414492 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 414563 2025-12-04T15:04:54.6446837Z I1204 14:38:24.576000 414492 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 414564 2025-12-04T15:04:54.6447386Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6447826Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6448408Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6448995Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6449446Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6449881Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6450488Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6451073Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6451543Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6451999Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6452567Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6453170Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6453622Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6454059Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6454640Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6455224Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6456586Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.6457999Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.6459421Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.6460874Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.6462310Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.6463724Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.6465152Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.6466558Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.6466864Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6467212Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6467703Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6468186Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6468666Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6469116Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6469555Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6470023Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6470529Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6471018Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6471502Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6471954Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6472408Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6472889Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6473568Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! 
Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3619684352. 2025-12-04T15:04:54.6474195Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6474543Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6475130Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6475634Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6476001Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6476413Z [rank1]:E1204 14:38:31.829000 414562 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.6476658Z dist init r=1, world=4 2025-12-04T15:04:54.6476864Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6477206Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6477690Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6478170Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6478648Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6479095Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6479536Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6479998Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6480508Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6480995Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6481459Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 
2025-12-04T15:04:54.6481956Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6482409Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6482877Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6483550Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3602907136. 2025-12-04T15:04:54.6484174Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6484527Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6485118Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6485631Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6485995Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6486406Z [rank2]:E1204 14:38:31.834000 414563 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.6486650Z dist init r=2, world=4 2025-12-04T15:04:54.6486853Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6487193Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6487676Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6488159Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6488640Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6489086Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6489539Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T15:04:54.6490027Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6490528Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6491000Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6491488Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6491953Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6492430Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6492900Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6493572Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488. 
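[Editor's note] The AccumulateGrad stream-mismatch warnings earlier in this session name their own off switch. Assuming the function behaves as the warning text describes (the name is taken verbatim from the warning, not verified independently), the opt-out is a one-liner, and is only appropriate when the stream mismatch is intentional; otherwise the warning's suggested fix is to drop lingering references to the autograd graph:

    import torch

    # Named in the UserWarning above; silences the stream-mismatch warning.
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)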
2025-12-04T15:04:54.6494194Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6494543Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6495131Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6495636Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6495999Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6496409Z [rank3]:E1204 14:38:31.838000 414564 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.6496651Z dist init r=3, world=4 2025-12-04T15:04:54.6496851Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6497186Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6497668Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6498149Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6498626Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6499111Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6499553Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6500017Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6500533Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6500998Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6501478Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6501937Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.6502397Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6502871Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6503531Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3755999232. 2025-12-04T15:04:54.6504152Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6504498Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6505084Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6505584Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6505946Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6506358Z [rank0]:E1204 14:38:31.903000 414561 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.6506602Z dist init r=0, world=4 2025-12-04T15:04:54.6506998Z [rank0]:[W1204 14:38:32.660734746 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.6507405Z FAILED [9.3195s] [100%] 2025-12-04T15:04:54.6507468Z 2025-12-04T15:04:54.6507530Z =================================== FAILURES =================================== 2025-12-04T15:04:54.6507722Z ___ TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda ____ 2025-12-04T15:04:54.6507917Z Traceback (most recent call last): 2025-12-04T15:04:54.6508177Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.6508423Z self._join_processes(fn) 2025-12-04T15:04:54.6508670Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.6508932Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.6509199Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.6509457Z raise RuntimeError(error) 2025-12-04T15:04:54.6509622Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.6509784Z Traceback (most recent call last): 2025-12-04T15:04:54.6510022Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6510309Z getattr(self, test_name)() 2025-12-04T15:04:54.6510561Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6510795Z fn() 2025-12-04T15:04:54.6510999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6511230Z method(*args, **kwargs) 2025-12-04T15:04:54.6511453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6511682Z method(*args, **kwargs) 2025-12-04T15:04:54.6511903Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6512132Z with policy(): 2025-12-04T15:04:54.6512349Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6512585Z raise RuntimeError(msg) 2025-12-04T15:04:54.6513006Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3619684352. 
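[Editor's note] The ProcessGroupNCCL warning repeated above flags that the workers exited without tearing down the process group. The documented pattern is to pair init_process_group with an explicit destroy; a minimal sketch follows, where the env-var rendezvous details (MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE) are assumed to be provided by the launcher:

    import torch.distributed as dist

    def main() -> None:
        # Uses the default env:// rendezvous; the launcher is assumed to
        # have set MASTER_ADDR / MASTER_PORT / RANK / WORLD_SIZE.
        dist.init_process_group(backend="nccl")
        try:
            ...  # collective work happens here
        finally:
            # Avoids the "destroy_process_group() was not called" warning.
            dist.destroy_process_group()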
2025-12-04T15:04:54.6513387Z 2025-12-04T15:04:54.6513464Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6513800Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6514066Z 2025-12-04T15:04:54.6514157Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6514280Z 2025-12-04T15:04:54.6514283Z 2025-12-04T15:04:54.6514363Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.6514565Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.6514918Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9be31e2ef266a3ef.xml - 2025-12-04T15:04:54.6515245Z =========================== short test summary info ============================ 2025-12-04T15:04:54.6515585Z FAILED [9.3195s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.6515909Z Traceback (most recent call last): 2025-12-04T15:04:54.6516151Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6516392Z getattr(self, test_name)() 2025-12-04T15:04:54.6516639Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6517173Z fn() 2025-12-04T15:04:54.6517390Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6517618Z method(*args, **kwargs) 2025-12-04T15:04:54.6517838Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6518065Z method(*args, **kwargs) 2025-12-04T15:04:54.6518283Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6518508Z with policy(): 2025-12-04T15:04:54.6518739Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6518970Z raise RuntimeError(msg) 2025-12-04T15:04:54.6519401Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3619684352. 2025-12-04T15:04:54.6519780Z 2025-12-04T15:04:54.6519856Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6520237Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6520500Z 2025-12-04T15:04:54.6520590Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6520780Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
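[Editor's note] The repro line the harness prints above can be replayed outside CI. A small Python wrapper around that same command, using the environment variables exactly as the log states (run from the base repo dir; PYTORCH_PRINT_REPRO_ON_FAILURE=0 optionally silences the repro banner):

    import os
    import subprocess

    env = dict(
        os.environ,
        PYTORCH_TEST_WITH_ROCM="1",
        PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
        # Uncomment to suppress the repro message on failure:
        # PYTORCH_PRINT_REPRO_ON_FAILURE="0",
    )

    # Command copied verbatim from the failure message above.
    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_core.py",
            "TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda",
        ],
        env=env,
        check=True,
    )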
2025-12-04T15:04:54.6520946Z ======================= 1 failed, 26 deselected in 9.48s ======================= 2025-12-04T15:04:54.6521087Z Got exit code 1 2025-12-04T15:04:54.6521319Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T15:04:54.6521654Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.6522003Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-69ab8b279cda0dc0.xml 2025-12-04T15:04:54.6522283Z ============================= test session starts ============================== 2025-12-04T15:04:54.6522493Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.6522682Z cachedir: .pytest_cache 2025-12-04T15:04:54.6522909Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.6523147Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.6523266Z configfile: pytest.ini 2025-12-04T15:04:54.6523490Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.6523763Z collecting ... collected 60 items / 1 deselected / 59 selected 2025-12-04T15:04:54.6523927Z stepcurrent: skipping 1 already run items. 2025-12-04T15:04:54.6524055Z Running 26 items in this shard 2025-12-04T15:04:54.6524127Z 2025-12-04T15:04:54.6524437Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_False_cuda I1204 14:38:36.200000 414894 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 414963 2025-12-04T15:04:54.6524934Z I1204 14:38:36.200000 414894 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 414964 2025-12-04T15:04:54.6525279Z I1204 14:38:36.201000 414894 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 414965 2025-12-04T15:04:54.6525652Z I1204 14:38:36.201000 414894 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 414966 2025-12-04T15:04:54.6526212Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6526649Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6527099Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6527531Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6527959Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6528390Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6528979Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6529562Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6530146Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6530771Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6531350Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6531926Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6532371Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6532806Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6533376Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.6533952Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6534191Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6534533Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6535023Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6535538Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6536016Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6536463Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6536916Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6537379Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6537858Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6538322Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6538787Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6539241Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6539694Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6540158Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6540860Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
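[Editor's note] The two UserWarnings above each state their own fix: give FSDP an indexed device (or call torch.cuda.set_device beforehand) so it does not have to guess, and build the encoder layer with batch_first=True so nested-tensor inference stays usable. A hedged sketch combining both; the model dimensions are placeholders, and a default process group is assumed to be initialized already:

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def build_model(rank: int) -> FSDP:
        # batch_first=True addresses the enable_nested_tensor UserWarning.
        layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
        model = nn.TransformerEncoder(layer, num_layers=2)

        # Either of these satisfies the FSDP device_id warning:
        torch.cuda.set_device(rank)          # set the current device explicitly, or
        device = torch.device("cuda", rank)  # pass a device with an explicit index.

        # Assumes dist.init_process_group() has already run on every rank.
        return FSDP(model.to(device), device_id=device)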
2025-12-04T15:04:54.6541480Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6541828Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6542416Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6542918Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6543279Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6543692Z [rank0]:E1204 14:38:43.357000 414963 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.6543936Z dist init r=0, world=4 2025-12-04T15:04:54.6544140Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6544499Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6544999Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6545477Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6545964Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6546410Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6546848Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6547324Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6547784Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6548248Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6548706Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6549156Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.6549610Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6550073Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6550766Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T15:04:54.6551383Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6551732Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6552314Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6552814Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6553175Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6553600Z [rank2]:E1204 14:38:43.360000 414965 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.6553854Z dist init r=2, world=4 2025-12-04T15:04:54.6554056Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6554392Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6596378Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6597119Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6597600Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6598075Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6598515Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6598976Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.6599439Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6599897Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6600407Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6600856Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6601305Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6601768Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6602434Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 2025-12-04T15:04:54.6603053Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6603399Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6603989Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6604510Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6604894Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6605306Z [rank1]:E1204 14:38:43.408000 414964 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.6605554Z dist init r=1, world=4 2025-12-04T15:04:54.6605760Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6606114Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6606599Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6607078Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T15:04:54.6607564Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6608007Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6608447Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6608905Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6609361Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6609816Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6610317Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6610766Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6611218Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6611680Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6612336Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 
2025-12-04T15:04:54.6612953Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6613300Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6613902Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6614419Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6614781Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6615187Z [rank3]:E1204 14:38:43.417000 414966 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.6615439Z dist init r=3, world=4 2025-12-04T15:04:54.6615838Z [rank0]:[W1204 14:38:43.061454204 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.6616245Z FAILED [9.1231s] [ 3%] 2025-12-04T15:04:54.6616308Z 2025-12-04T15:04:54.6616381Z =================================== FAILURES =================================== 2025-12-04T15:04:54.6616575Z ___ TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda ____ 2025-12-04T15:04:54.6616751Z Traceback (most recent call last): 2025-12-04T15:04:54.6616994Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.6617236Z self._join_processes(fn) 2025-12-04T15:04:54.6617478Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.6617739Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.6618001Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.6618258Z raise RuntimeError(error) 2025-12-04T15:04:54.6618408Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.6618567Z Traceback (most recent call last): 2025-12-04T15:04:54.6618800Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6619036Z getattr(self, test_name)() 2025-12-04T15:04:54.6619261Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6619490Z fn() 2025-12-04T15:04:54.6619690Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6619917Z method(*args, **kwargs) 2025-12-04T15:04:54.6620134Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6620393Z method(*args, **kwargs) 2025-12-04T15:04:54.6620608Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6620833Z with policy(): 2025-12-04T15:04:54.6621042Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6621268Z raise RuntimeError(msg) 2025-12-04T15:04:54.6621682Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 2025-12-04T15:04:54.6622063Z 2025-12-04T15:04:54.6622139Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6622493Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6622775Z 2025-12-04T15:04:54.6622868Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6622992Z 2025-12-04T15:04:54.6623053Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.6623190Z Traceback (most recent call last): 2025-12-04T15:04:54.6623425Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6623665Z getattr(self, test_name)() 2025-12-04T15:04:54.6623907Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6624136Z fn() 2025-12-04T15:04:54.6624332Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6624558Z method(*args, **kwargs) 2025-12-04T15:04:54.6624773Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6624997Z method(*args, **kwargs) 2025-12-04T15:04:54.6625224Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6625446Z with policy(): 2025-12-04T15:04:54.6625653Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6625881Z raise RuntimeError(msg) 2025-12-04T15:04:54.6626301Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 
2025-12-04T15:04:54.6626683Z 2025-12-04T15:04:54.6626757Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6627100Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6627364Z 2025-12-04T15:04:54.6627452Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6627575Z 2025-12-04T15:04:54.6627577Z 2025-12-04T15:04:54.6627657Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.6627859Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.6628217Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-69ab8b279cda0dc0.xml - 2025-12-04T15:04:54.6628547Z =========================== short test summary info ============================ 2025-12-04T15:04:54.6628891Z FAILED [9.1231s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.6629212Z Traceback (most recent call last): 2025-12-04T15:04:54.6629452Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6629690Z getattr(self, test_name)() 2025-12-04T15:04:54.6629916Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6630144Z fn() 2025-12-04T15:04:54.6630402Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6630626Z method(*args, **kwargs) 2025-12-04T15:04:54.6630839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6631105Z method(*args, **kwargs) 2025-12-04T15:04:54.6631319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6631540Z with policy(): 2025-12-04T15:04:54.6631746Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6631972Z raise RuntimeError(msg) 2025-12-04T15:04:54.6632400Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T15:04:54.6632778Z 2025-12-04T15:04:54.6632852Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6633184Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6633446Z 2025-12-04T15:04:54.6633534Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6633672Z 2025-12-04T15:04:54.6633731Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.6633869Z Traceback (most recent call last): 2025-12-04T15:04:54.6634105Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6634345Z getattr(self, test_name)() 2025-12-04T15:04:54.6634574Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6634801Z fn() 2025-12-04T15:04:54.6634996Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6635221Z method(*args, **kwargs) 2025-12-04T15:04:54.6635435Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6635660Z method(*args, **kwargs) 2025-12-04T15:04:54.6635872Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6636093Z with policy(): 2025-12-04T15:04:54.6636297Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6636524Z raise RuntimeError(msg) 2025-12-04T15:04:54.6636941Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T15:04:54.6637320Z 2025-12-04T15:04:54.6637394Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6637727Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6637985Z 2025-12-04T15:04:54.6638072Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6638254Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.6638417Z ======================= 1 failed, 1 deselected in 9.28s ======================== 2025-12-04T15:04:54.6638552Z Got exit code 1 2025-12-04T15:04:54.6638645Z Retrying single test... 
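The FSDP `device_id` warnings above prescribe a concrete remedy: bind each rank to its GPU before wrapping the model, or pass an explicit device index instead of the bare "cuda" device. A minimal sketch of that pattern, assuming a torchrun-style launcher that sets LOCAL_RANK and an arbitrary `model` module (both hypothetical here, not taken from the test code):

import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_with_fsdp(model):
    # Hypothetical setup; assumes a torchrun-style launcher sets LOCAL_RANK.
    local_rank = int(os.environ["LOCAL_RANK"])
    dist.init_process_group("nccl")
    # Bind this process to its GPU before FSDP initialization, as the warning asks...
    torch.cuda.set_device(local_rank)
    # ...and pass an explicit device index rather than the bare "cuda" device.
    return FSDP(model, device_id=local_rank)

Pairing this with dist.destroy_process_group() at teardown would also address the ProcessGroupNCCL shutdown warning logged above.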
2025-12-04T15:04:54.6638896Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9313e122cafdc216.xml 2025-12-04T15:04:54.6639174Z ============================= test session starts ============================== 2025-12-04T15:04:54.6639396Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.6639596Z cachedir: .pytest_cache 2025-12-04T15:04:54.6639817Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.6640053Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.6640202Z configfile: pytest.ini 2025-12-04T15:04:54.6640426Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.6640696Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.6641046Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6641338Z Running 1 items in this shard 2025-12-04T15:04:54.6641410Z 2025-12-04T15:04:54.6641717Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_False_cuda I1204 14:38:47.727000 415296 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 415365 2025-12-04T15:04:54.6642221Z I1204 14:38:47.728000 415296 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 415366 2025-12-04T15:04:54.6642558Z I1204 14:38:47.728000 415296 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 415367 2025-12-04T15:04:54.6642890Z I1204 14:38:47.729000 415296 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 415368 2025-12-04T15:04:54.6643434Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6643868Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6644443Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.6645019Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6645464Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6645892Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6646319Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6646749Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6647309Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6647882Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6648453Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6649056Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6649497Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6649928Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6650530Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.6651107Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6651356Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6651692Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6652177Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6652653Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6653126Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6653577Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6654013Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6654474Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6654932Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6655394Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6655854Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6656300Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6656750Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6657211Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6657872Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T15:04:54.6658524Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6658875Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6659473Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6659978Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6660375Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6660802Z [rank0]:E1204 14:38:55.092000 415365 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.6661042Z dist init r=0, world=4 2025-12-04T15:04:54.6661250Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6661583Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6662071Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6662543Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6663014Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6663456Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6663892Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6664351Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6664810Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6665267Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6665724Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6666172Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.6666622Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6667126Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6667782Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 2025-12-04T15:04:54.6668398Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6668760Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6669346Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6669859Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6670262Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6670671Z [rank3]:E1204 14:38:55.099000 415368 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.6670907Z dist init r=3, world=4 2025-12-04T15:04:54.6671104Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6671437Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6671926Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6672401Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6672875Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6673318Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6673751Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6674211Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.6674670Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6675130Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6675589Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6676048Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6676515Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6676974Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6677638Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T15:04:54.6678253Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6678596Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6679193Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6679690Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6680050Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6680504Z [rank2]:E1204 14:38:55.101000 415367 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.6680743Z dist init r=2, world=4 2025-12-04T15:04:54.6680943Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6681275Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6681760Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6682236Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T15:04:54.6682706Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6683149Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6683581Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6684038Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6684496Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6684952Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6685424Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6685883Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6686328Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6686802Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6687456Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
2025-12-04T15:04:54.6688080Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6688425Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6689004Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6689501Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6689860Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6690296Z [rank1]:E1204 14:38:55.102000 415366 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.6690532Z dist init r=1, world=4 2025-12-04T15:04:54.6690927Z [rank0]:[W1204 14:38:55.767123902 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.6691334Z FAILED [9.2221s] [100%] 2025-12-04T15:04:54.6691397Z 2025-12-04T15:04:54.6691459Z =================================== FAILURES =================================== 2025-12-04T15:04:54.6691651Z ___ TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda ____ 2025-12-04T15:04:54.6691827Z Traceback (most recent call last): 2025-12-04T15:04:54.6692073Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.6692311Z self._join_processes(fn) 2025-12-04T15:04:54.6692557Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.6692816Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.6693080Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.6693336Z raise RuntimeError(error) 2025-12-04T15:04:54.6693484Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.6693643Z Traceback (most recent call last): 2025-12-04T15:04:54.6693877Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6694132Z getattr(self, test_name)() 2025-12-04T15:04:54.6694375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6694608Z fn() 2025-12-04T15:04:54.6694812Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6695039Z method(*args, **kwargs) 2025-12-04T15:04:54.6695258Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6695483Z method(*args, **kwargs) 2025-12-04T15:04:54.6695715Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6695941Z with policy(): 2025-12-04T15:04:54.6696150Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6696388Z raise RuntimeError(msg) 2025-12-04T15:04:54.6696823Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 2025-12-04T15:04:54.6697202Z 2025-12-04T15:04:54.6697276Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6697608Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6697869Z 2025-12-04T15:04:54.6697961Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6698084Z 2025-12-04T15:04:54.6698085Z 2025-12-04T15:04:54.6698164Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.6698365Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.6698718Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9313e122cafdc216.xml - 2025-12-04T15:04:54.6699041Z =========================== short test summary info ============================ 2025-12-04T15:04:54.6699384Z FAILED [9.2221s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.6699702Z Traceback (most recent call last): 2025-12-04T15:04:54.6699944Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6700220Z getattr(self, test_name)() 2025-12-04T15:04:54.6700449Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6700680Z fn() 2025-12-04T15:04:54.6700874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6701103Z method(*args, **kwargs) 2025-12-04T15:04:54.6701320Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6701545Z method(*args, **kwargs) 2025-12-04T15:04:54.6701758Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6701979Z with policy(): 2025-12-04T15:04:54.6702186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6702412Z raise RuntimeError(msg) 2025-12-04T15:04:54.6702842Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T15:04:54.6703243Z 2025-12-04T15:04:54.6703316Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6703650Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6703915Z 2025-12-04T15:04:54.6704003Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6704204Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.6704373Z ======================= 1 failed, 26 deselected in 9.38s ======================= 2025-12-04T15:04:54.6704508Z Got exit code 1 2025-12-04T15:04:54.6704604Z Retrying single test... 2025-12-04T15:04:54.6704854Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f5a74cd97ab9c392.xml 2025-12-04T15:04:54.6705150Z ============================= test session starts ============================== 2025-12-04T15:04:54.6705360Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.6705544Z cachedir: .pytest_cache 2025-12-04T15:04:54.6705762Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.6705995Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.6706113Z configfile: pytest.ini 2025-12-04T15:04:54.6706335Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.6706607Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.6706936Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6707232Z Running 1 items in this shard 2025-12-04T15:04:54.6707306Z 2025-12-04T15:04:54.6707613Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_False_cuda I1204 14:38:59.533000 415698 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 415767 2025-12-04T15:04:54.6708104Z I1204 14:38:59.534000 415698 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 415768 2025-12-04T15:04:54.6708443Z I1204 14:38:59.534000 415698 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 415769 2025-12-04T15:04:54.6708776Z I1204 14:38:59.535000 415698 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 415770 2025-12-04T15:04:54.6709322Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.6709756Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6710365Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6710942Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6711388Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6711851Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6712416Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6712993Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6713458Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6713888Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6714325Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6714749Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6715311Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6715883Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6716457Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.6717034Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6717270Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6717610Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6718098Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6718574Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6719054Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6719496Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6719935Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6720441Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6720913Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6721386Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6721847Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6722299Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6722766Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6723233Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6723906Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
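The leak checker raising the RuntimeError above takes per-device memory readings before and after the test body and fails when both the caching-allocator counter and the driver-level counter grow, which is exactly what the message reports. Below is a minimal sketch of that before/after pattern built only from public torch.cuda counters; it illustrates the technique, not the actual common_utils.py implementation, and the helper name assert_no_cuda_leak is hypothetical.

import torch

def assert_no_cuda_leak(test_fn, device: int = 0) -> None:
    # Snapshot caching-allocator bytes and driver-level usage (total - free).
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    free, total = torch.cuda.mem_get_info(device)
    driver_before = total - free

    test_fn()

    # Drop cached blocks so only memory still held by live tensors remains.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free, total = torch.cuda.mem_get_info(device)
    driver_after = total - free

    # Flag a leak only when both counters grew, mirroring the log message
    # ("CUDA driver API confirmed a leak ... Caching allocator ... driver ...").
    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
        )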
2025-12-04T15:04:54.6724527Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6724873Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6725460Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6725966Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6726326Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6726737Z [rank1]:E1204 14:39:06.666000 415768 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.6727076Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6727416Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6727907Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6728383Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6728853Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6729295Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6729730Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6730257Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6730717Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6731177Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6731658Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6732106Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6732560Z [rank2]:E1204 14:39:06.666000 415769 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6733036Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6733693Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T15:04:54.6734309Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6734655Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6735242Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6735741Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6736101Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6736519Z [rank2]:E1204 14:39:06.666000 415769 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.6736761Z dist init r=1, world=4 2025-12-04T15:04:54.6736863Z dist init r=2, world=4 2025-12-04T15:04:54.6737063Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6737397Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6737882Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6738355Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6738827Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6739284Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6739747Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6740239Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T15:04:54.6740723Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6741181Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6741637Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6742099Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6742547Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6743008Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6743664Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 2025-12-04T15:04:54.6744280Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6744625Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6745207Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6745322Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6745532Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6745700Z [rank0]:E1204 14:39:06.673000 415767 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.6745739Z dist init r=0, world=4 2025-12-04T15:04:54.6745878Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6746035Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6746329Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6746480Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, 
test_name)() 2025-12-04T15:04:54.6746792Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6746913Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6747190Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6747346Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6747620Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6747767Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6748049Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6748188Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6748469Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6748619Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6749088Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 
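The `device_id` warnings earlier in this run come from handing FSDP the bare device `cuda` with no index. A hedged sketch of the fix the warning text itself suggests, assuming `rank` is this process's distributed rank (the helper name wrap_for_rank is illustrative):

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_for_rank(model: torch.nn.Module, rank: int) -> FSDP:
    # Pin this process to one GPU before any CUDA work, per the warning text.
    torch.cuda.set_device(rank)
    # An indexed device (cuda:<rank>) instead of the bare "cuda" avoids the warning.
    return FSDP(model, device_id=torch.device("cuda", rank))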
2025-12-04T15:04:54.6749201Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6749397Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6749748Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6749863Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6750072Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6750275Z [rank3]:E1204 14:39:06.717000 415770 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.6750313Z dist init r=3, world=4 2025-12-04T15:04:54.6750647Z [rank0]:[W1204 14:39:06.367568939 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.6750702Z FAILED [9.1199s] [100%] 2025-12-04T15:04:54.6750704Z 2025-12-04T15:04:54.6750762Z =================================== FAILURES =================================== 2025-12-04T15:04:54.6750876Z ___ TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda ____ 2025-12-04T15:04:54.6750923Z Traceback (most recent call last): 2025-12-04T15:04:54.6751085Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.6751128Z self._join_processes(fn) 2025-12-04T15:04:54.6751306Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.6751359Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.6751546Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.6751590Z raise RuntimeError(error) 2025-12-04T15:04:54.6751672Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.6751719Z Traceback (most recent call last): 2025-12-04T15:04:54.6751891Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6751933Z getattr(self, test_name)() 2025-12-04T15:04:54.6752093Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6752129Z fn() 2025-12-04T15:04:54.6752281Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6752325Z method(*args, **kwargs) 2025-12-04T15:04:54.6752478Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6752518Z method(*args, **kwargs) 2025-12-04T15:04:54.6752672Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6752709Z with policy(): 2025-12-04T15:04:54.6752864Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6752906Z raise RuntimeError(msg) 2025-12-04T15:04:54.6753255Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 2025-12-04T15:04:54.6753257Z 2025-12-04T15:04:54.6753333Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6753559Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6753562Z 2025-12-04T15:04:54.6753651Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6753654Z 2025-12-04T15:04:54.6753714Z Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.6753766Z Traceback (most recent call last): 2025-12-04T15:04:54.6753929Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6753972Z getattr(self, test_name)() 2025-12-04T15:04:54.6754129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6754169Z fn() 2025-12-04T15:04:54.6754319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6754364Z method(*args, **kwargs) 2025-12-04T15:04:54.6754513Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6754585Z method(*args, **kwargs) 2025-12-04T15:04:54.6754733Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6754773Z with policy(): 2025-12-04T15:04:54.6754924Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6754967Z raise RuntimeError(msg) 2025-12-04T15:04:54.6755323Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
2025-12-04T15:04:54.6755326Z 2025-12-04T15:04:54.6755403Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6755629Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6755635Z 2025-12-04T15:04:54.6755732Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6755734Z 2025-12-04T15:04:54.6755795Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.6755840Z Traceback (most recent call last): 2025-12-04T15:04:54.6756004Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6756046Z getattr(self, test_name)() 2025-12-04T15:04:54.6756207Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6756242Z fn() 2025-12-04T15:04:54.6756394Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6756433Z method(*args, **kwargs) 2025-12-04T15:04:54.6756585Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6756626Z method(*args, **kwargs) 2025-12-04T15:04:54.6756778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6756813Z with policy(): 2025-12-04T15:04:54.6756963Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6757003Z raise RuntimeError(msg) 2025-12-04T15:04:54.6757356Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T15:04:54.6757359Z 2025-12-04T15:04:54.6757430Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6757661Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6757664Z 2025-12-04T15:04:54.6757748Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6757754Z 2025-12-04T15:04:54.6757756Z 2025-12-04T15:04:54.6757832Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.6757921Z Process 0 terminated with exit code 10, terminating remaining processes. 
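The transformer.py UserWarning repeated throughout this run means the nested-tensor fast path was silently disabled because the encoder layer's self-attention was built without batch_first=True. A minimal sketch of a construction that keeps the fast path; the d_model, nhead, and num_layers values are placeholders, not taken from this test:

import torch.nn as nn

# batch_first=True on the layer's self-attention keeps use_nested_tensor
# enabled inside TransformerEncoder, avoiding the UserWarning above.
layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=6, enable_nested_tensor=True)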
2025-12-04T15:04:54.6758155Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f5a74cd97ab9c392.xml - 2025-12-04T15:04:54.6758218Z =========================== short test summary info ============================ 2025-12-04T15:04:54.6758470Z FAILED [9.1199s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.6758531Z Traceback (most recent call last): 2025-12-04T15:04:54.6758692Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6758738Z getattr(self, test_name)() 2025-12-04T15:04:54.6758896Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6758930Z fn() 2025-12-04T15:04:54.6759088Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6759132Z method(*args, **kwargs) 2025-12-04T15:04:54.6759282Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6759326Z method(*args, **kwargs) 2025-12-04T15:04:54.6759475Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6759521Z with policy(): 2025-12-04T15:04:54.6759675Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6759718Z raise RuntimeError(msg) 2025-12-04T15:04:54.6760064Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T15:04:54.6760067Z 2025-12-04T15:04:54.6760141Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6760408Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6760411Z 2025-12-04T15:04:54.6760499Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6760501Z 2025-12-04T15:04:54.6760565Z Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.6762418Z Traceback (most recent call last): 2025-12-04T15:04:54.6762582Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6762625Z getattr(self, test_name)() 2025-12-04T15:04:54.6762788Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6762823Z fn() 2025-12-04T15:04:54.6762974Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6763015Z method(*args, **kwargs) 2025-12-04T15:04:54.6763172Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6763212Z method(*args, **kwargs) 2025-12-04T15:04:54.6763363Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6763398Z with policy(): 2025-12-04T15:04:54.6763551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6763591Z raise RuntimeError(msg) 2025-12-04T15:04:54.6763940Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
2025-12-04T15:04:54.6763964Z 2025-12-04T15:04:54.6764039Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6764282Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6764285Z 2025-12-04T15:04:54.6764373Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6764375Z 2025-12-04T15:04:54.6764433Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.6764479Z Traceback (most recent call last): 2025-12-04T15:04:54.6764641Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6764704Z getattr(self, test_name)() 2025-12-04T15:04:54.6764863Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6764902Z fn() 2025-12-04T15:04:54.6765054Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6765095Z method(*args, **kwargs) 2025-12-04T15:04:54.6765257Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6765301Z method(*args, **kwargs) 2025-12-04T15:04:54.6765448Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6765487Z with policy(): 2025-12-04T15:04:54.6765635Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6765681Z raise RuntimeError(msg) 2025-12-04T15:04:54.6766021Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T15:04:54.6766025Z 2025-12-04T15:04:54.6766101Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6766327Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6766332Z 2025-12-04T15:04:54.6766418Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6766483Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T15:04:54.6766546Z ======================= 1 failed, 26 deselected in 9.26s ======================= 2025-12-04T15:04:54.6766587Z Got exit code 1 2025-12-04T15:04:54.6766759Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_False_cuda 2025-12-04T15:04:54.6766888Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.6767075Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d9e21bdba762758b.xml 2025-12-04T15:04:54.6767137Z ============================= test session starts ============================== 2025-12-04T15:04:54.6767250Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.6767297Z cachedir: .pytest_cache 2025-12-04T15:04:54.6767455Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.6767508Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.6767550Z configfile: pytest.ini 2025-12-04T15:04:54.6767714Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.6767797Z collecting ... collected 60 items / 2 deselected / 58 selected 2025-12-04T15:04:54.6767866Z stepcurrent: skipping 2 already run items. 2025-12-04T15:04:54.6767908Z Running 25 items in this shard 2025-12-04T15:04:54.6767911Z 2025-12-04T15:04:54.6768235Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_True_cuda I1204 14:39:11.214000 416100 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 416169 2025-12-04T15:04:54.6768388Z I1204 14:39:11.215000 416100 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 416170 2025-12-04T15:04:54.6768551Z I1204 14:39:11.215000 416100 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 416171 2025-12-04T15:04:54.6768699Z I1204 14:39:11.216000 416100 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 416172 2025-12-04T15:04:54.6769077Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6769130Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6769415Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.6769485Z {} 2025-12-04T15:04:54.6769587Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.6769661Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.6770149Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2.
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6770273Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6770624Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6770671Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6770961Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.6771025Z {} 2025-12-04T15:04:54.6771128Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.6771201Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.6771690Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6771748Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6772101Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6772180Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6772463Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.6772522Z {} 2025-12-04T15:04:54.6772623Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.6772693Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.6773186Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.6773246Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6773608Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6773654Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6773938Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.6773998Z {} 2025-12-04T15:04:54.6774101Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.6774181Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.6774672Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6774736Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6774881Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6775051Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6775348Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6775507Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6775798Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6775925Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6776212Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6776370Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6776661Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6776812Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6777089Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6777241Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6777518Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6777684Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6778177Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T15:04:54.6778298Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6778501Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6778881Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6779001Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6779214Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6779386Z [rank3]:E1204 14:39:17.443000 416172 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.6779428Z dist init r=3, world=4 2025-12-04T15:04:54.6779572Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6779733Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6780025Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6780213Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6780505Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6780635Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6780942Z [rank1]:E1204 14:39:17.456000 416170 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6781095Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6781371Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6781535Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6781811Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6781951Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6782241Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6782394Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6782890Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
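The _wrap_utils.py warning above fires when mixed precision is combined with an auto_wrap_policy and some of the resulting FSDP units have mixed precision overridden. A hedged sketch of that combination; the layer class, dtypes, and the helper name wrap_sharded are illustrative choices, not values taken from this test:

import functools
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, MixedPrecision
from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy

def wrap_sharded(model: torch.nn.Module, rank: int) -> FSDP:
    # Wrap each TransformerEncoderLayer as its own FSDP unit...
    policy = functools.partial(
        transformer_auto_wrap_policy,
        transformer_layer_cls={torch.nn.TransformerEncoderLayer},
    )
    # ...and keep parameters, gradient reduction, and buffers in fp16.
    mp = MixedPrecision(
        param_dtype=torch.float16,
        reduce_dtype=torch.float16,
        buffer_dtype=torch.float16,
    )
    return FSDP(model, auto_wrap_policy=policy, mixed_precision=mp, device_id=rank)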
2025-12-04T15:04:54.6783006Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6783207Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6783581Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6783702Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6783914Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6784085Z [rank1]:E1204 14:39:17.456000 416170 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.6784130Z dist init r=1, world=4 2025-12-04T15:04:54.6784272Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6784437Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6784725Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6784885Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6785178Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6785321Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6785599Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6785752Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6786047Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6786195Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6786485Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6786622Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.6786903Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6787052Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6787548Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T15:04:54.6787667Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6787864Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6788243Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6788359Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6788573Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6788734Z [rank0]:E1204 14:39:17.466000 416169 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.6788774Z dist init r=0, world=4 2025-12-04T15:04:54.6788909Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6789068Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6789352Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6789527Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6789811Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6789936Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6790265Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6790415Z [rank2]:E1204 14:39:17.530000 416171 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6790708Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6790856Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6791137Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6791280Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6791558Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6791713Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6792201Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T15:04:54.6792321Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6792517Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6792896Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6793015Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6793226Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6793394Z [rank2]:E1204 14:39:17.530000 416171 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.6793434Z dist init r=2, world=4 2025-12-04T15:04:54.6793787Z [rank0]:[W1204 14:39:17.257149614 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.6793841Z FAILED [8.0169s] [ 4%] 2025-12-04T15:04:54.6793843Z 2025-12-04T15:04:54.6793906Z =================================== FAILURES =================================== 2025-12-04T15:04:54.6794020Z _ TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda _ 2025-12-04T15:04:54.6794069Z Traceback (most recent call last): 2025-12-04T15:04:54.6794234Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.6794293Z self._join_processes(fn) 2025-12-04T15:04:54.6794468Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.6794528Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.6794708Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.6794766Z raise RuntimeError(error) 2025-12-04T15:04:54.6794849Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.6794900Z Traceback (most recent call last): 2025-12-04T15:04:54.6795062Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6795109Z getattr(self, test_name)() 2025-12-04T15:04:54.6795266Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6795301Z fn() 2025-12-04T15:04:54.6795452Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6795499Z method(*args, **kwargs) 2025-12-04T15:04:54.6795653Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6795700Z method(*args, **kwargs) 2025-12-04T15:04:54.6795852Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6795895Z with policy(): 2025-12-04T15:04:54.6796052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6796095Z raise RuntimeError(msg) 2025-12-04T15:04:54.6796467Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 
2025-12-04T15:04:54.6796470Z 2025-12-04T15:04:54.6796546Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6796799Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6796802Z 2025-12-04T15:04:54.6796892Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6796894Z 2025-12-04T15:04:54.6796896Z 2025-12-04T15:04:54.6796977Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.6797066Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.6797307Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d9e21bdba762758b.xml - 2025-12-04T15:04:54.6797371Z =========================== short test summary info ============================ 2025-12-04T15:04:54.6797647Z FAILED [8.0169s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.6797710Z Traceback (most recent call last): 2025-12-04T15:04:54.6797875Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6797924Z getattr(self, test_name)() 2025-12-04T15:04:54.6798085Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6798126Z fn() 2025-12-04T15:04:54.6798287Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6798334Z method(*args, **kwargs) 2025-12-04T15:04:54.6798485Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6798533Z method(*args, **kwargs) 2025-12-04T15:04:54.6798692Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6798737Z with policy(): 2025-12-04T15:04:54.6798888Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6798935Z raise RuntimeError(msg) 2025-12-04T15:04:54.6799301Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T15:04:54.6799303Z 2025-12-04T15:04:54.6799383Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6799631Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6799634Z 2025-12-04T15:04:54.6799729Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6799794Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
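The RuntimeError above is raised by the leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables (the traceback shows it firing from the policy context manager's __exit__ in common_utils.py): it records caching-allocator and driver-level allocation counters around the test body and fails the test when both grow, here 512 -> 13312 allocator bytes and exactly 740 MiB of driver memory on every rank. A minimal sketch of that before/after comparison, assuming a hypothetical `run_test_body` callable and using only public torch.cuda counters; the actual checker in torch/testing/_internal/common_utils.py is more thorough:

```python
# Minimal sketch of a before/after CUDA memory comparison, assuming a
# hypothetical `run_test_body` callable; not the actual implementation
# in torch/testing/_internal/common_utils.py.
import torch

def check_for_leak(run_test_body, device: int = 0) -> None:
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)  # caching-allocator bytes
    free_before, _ = torch.cuda.mem_get_info(device)    # driver-level free bytes

    run_test_body()

    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)

    # Flag only when both the allocator and the driver report growth,
    # mirroring the "CUDA driver API confirmed a leak" wording above.
    if alloc_after > alloc_before and free_after < free_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator went "
            f"from {alloc_before} to {alloc_after} bytes"
        )
```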
2025-12-04T15:04:54.6799863Z ======================= 1 failed, 2 deselected in 8.16s ======================== 2025-12-04T15:04:54.6799906Z Got exit code 1 2025-12-04T15:04:54.6799950Z Retrying single test... 2025-12-04T15:04:54.6800144Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3a63bc5300e41deb.xml 2025-12-04T15:04:54.6800237Z ============================= test session starts ============================== 2025-12-04T15:04:54.6800352Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.6800396Z cachedir: .pytest_cache 2025-12-04T15:04:54.6800559Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.6800607Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.6800654Z configfile: pytest.ini 2025-12-04T15:04:54.6800816Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.6800897Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.6801139Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6801185Z Running 1 items in this shard 2025-12-04T15:04:54.6801187Z 2025-12-04T15:04:54.6801508Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_True_cuda I1204 14:39:21.792000 416486 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 416555 2025-12-04T15:04:54.6801708Z I1204 14:39:21.792000 416486 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 416556 2025-12-04T15:04:54.6801861Z I1204 14:39:21.793000 416486 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 416557 2025-12-04T15:04:54.6802019Z I1204 14:39:21.793000 416486 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 416558 2025-12-04T15:04:54.6802395Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6802452Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6802744Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.6802822Z {} 2025-12-04T15:04:54.6802932Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.6803007Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.6803504Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.6803568Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6803927Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6803978Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6804270Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.6804334Z {} 2025-12-04T15:04:54.6804445Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.6804519Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.6805011Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6805078Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6805432Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6805486Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6805838Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6805909Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6806194Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.6806260Z {} 2025-12-04T15:04:54.6806364Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.6806442Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.6806943Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.6807005Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6807303Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.6807366Z {} 2025-12-04T15:04:54.6807473Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.6807542Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.6808031Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6808092Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6808242Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6808407Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6808700Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6808861Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6809149Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6809281Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6809559Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6809712Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6809989Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6810152Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6810483Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6810619Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6810917Z [rank3]:E1204 14:39:27.842000 416558
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6811067Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6811576Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T15:04:54.6811695Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6811896Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6812276Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6812391Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6812608Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6812773Z [rank3]:E1204 14:39:27.842000 416558 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.6812817Z dist init r=3, world=4 2025-12-04T15:04:54.6812957Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6813123Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6813411Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6813573Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6813860Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6813991Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6814275Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6814424Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6814729Z 
[rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6814877Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6815156Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6815304Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6815587Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6815751Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6816242Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T15:04:54.6816363Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6816559Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6816938Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6817053Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6817270Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6817438Z [rank2]:E1204 14:39:27.847000 416557 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.6817479Z dist init r=2, world=4 2025-12-04T15:04:54.6817623Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6817785Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6818076Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6818230Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, 
test_name)() 2025-12-04T15:04:54.6818519Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6818643Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6819049Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6819208Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6819488Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6819656Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6819931Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6820075Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6820408Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6820560Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6821049Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T15:04:54.6821168Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6821362Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6821732Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6821849Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6822056Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6822224Z [rank1]:E1204 14:39:27.848000 416556 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.6822266Z dist init r=1, world=4 2025-12-04T15:04:54.6822409Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6822568Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6822859Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6823012Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6823313Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6823457Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6823735Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6823888Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6824177Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6824331Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6824618Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6824760Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.6825043Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6825191Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6825687Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T15:04:54.6825803Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6826004Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6826374Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6826493Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6826710Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6826873Z [rank0]:E1204 14:39:27.858000 416555 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.6826918Z dist init r=0, world=4 2025-12-04T15:04:54.6827255Z [rank0]:[W1204 14:39:28.628654271 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.6827302Z FAILED [7.9177s] [100%] 2025-12-04T15:04:54.6827304Z 2025-12-04T15:04:54.6827372Z =================================== FAILURES =================================== 2025-12-04T15:04:54.6827498Z _ TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda _ 2025-12-04T15:04:54.6827546Z Traceback (most recent call last): 2025-12-04T15:04:54.6827715Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.6827761Z self._join_processes(fn) 2025-12-04T15:04:54.6827939Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.6827995Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.6828189Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.6828235Z raise RuntimeError(error) 2025-12-04T15:04:54.6828320Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.6828368Z Traceback (most recent call last): 2025-12-04T15:04:54.6828535Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6828575Z getattr(self, test_name)() 2025-12-04T15:04:54.6828748Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6828783Z fn() 2025-12-04T15:04:54.6828932Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6828974Z method(*args, **kwargs) 2025-12-04T15:04:54.6829124Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6829166Z method(*args, **kwargs) 2025-12-04T15:04:54.6829317Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6829354Z with policy(): 2025-12-04T15:04:54.6829509Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6829552Z raise RuntimeError(msg) 2025-12-04T15:04:54.6829914Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T15:04:54.6829917Z 2025-12-04T15:04:54.6829993Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6830274Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6830277Z 2025-12-04T15:04:54.6830367Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6830369Z 2025-12-04T15:04:54.6830428Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.6830477Z Traceback (most recent call last): 2025-12-04T15:04:54.6830638Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6830680Z getattr(self, test_name)() 2025-12-04T15:04:54.6830835Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6830872Z fn() 2025-12-04T15:04:54.6831022Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6831064Z method(*args, **kwargs) 2025-12-04T15:04:54.6831215Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6831272Z method(*args, **kwargs) 2025-12-04T15:04:54.6831434Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6831473Z with policy(): 2025-12-04T15:04:54.6831623Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6831666Z raise RuntimeError(msg) 2025-12-04T15:04:54.6832041Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 
2025-12-04T15:04:54.6832044Z 2025-12-04T15:04:54.6832116Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6832359Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6832363Z 2025-12-04T15:04:54.6832450Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6832476Z 2025-12-04T15:04:54.6832536Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.6832580Z Traceback (most recent call last): 2025-12-04T15:04:54.6840466Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6840514Z getattr(self, test_name)() 2025-12-04T15:04:54.6840686Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6840721Z fn() 2025-12-04T15:04:54.6840875Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6840917Z method(*args, **kwargs) 2025-12-04T15:04:54.6841071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6841110Z method(*args, **kwargs) 2025-12-04T15:04:54.6841261Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6841298Z with policy(): 2025-12-04T15:04:54.6841449Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6841488Z raise RuntimeError(msg) 2025-12-04T15:04:54.6841853Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T15:04:54.6841857Z 2025-12-04T15:04:54.6841933Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6842179Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6842181Z 2025-12-04T15:04:54.6842269Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6842271Z 2025-12-04T15:04:54.6842273Z 2025-12-04T15:04:54.6842351Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.6842438Z Process 1 terminated with exit code 10, terminating remaining processes. 
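The ProcessGroupNCCL warning in the output above ("destroy_process_group() was not called before program exit, which can leak resources") names a teardown step each rank should perform. A minimal sketch of a worker that tears the group down on exit; the function, its arguments, and the env:// rendezvous (which assumes MASTER_ADDR/MASTER_PORT are set) are illustrative, not taken from this test harness:

```python
# Hedged sketch of per-rank setup/teardown; rendezvous assumes MASTER_ADDR
# and MASTER_PORT are set in the environment. Illustrative only.
import torch
import torch.distributed as dist

def worker(rank: int, world_size: int) -> None:
    torch.cuda.set_device(rank)
    dist.init_process_group(
        backend="nccl", init_method="env://", rank=rank, world_size=world_size
    )
    try:
        dist.barrier()  # stand-in for the actual test body
    finally:
        # Explicit teardown avoids the resource-leak warning at process exit.
        dist.destroy_process_group()
```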
2025-12-04T15:04:54.6842673Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3a63bc5300e41deb.xml - 2025-12-04T15:04:54.6842735Z =========================== short test summary info ============================ 2025-12-04T15:04:54.6843031Z FAILED [7.9177s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.6843099Z Traceback (most recent call last): 2025-12-04T15:04:54.6843262Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6843305Z getattr(self, test_name)() 2025-12-04T15:04:54.6843462Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6843498Z fn() 2025-12-04T15:04:54.6843661Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6843702Z method(*args, **kwargs) 2025-12-04T15:04:54.6843852Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6843894Z method(*args, **kwargs) 2025-12-04T15:04:54.6844065Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6844103Z with policy(): 2025-12-04T15:04:54.6844253Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6844293Z raise RuntimeError(msg) 2025-12-04T15:04:54.6844660Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T15:04:54.6844662Z 2025-12-04T15:04:54.6844736Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6844986Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6844989Z 2025-12-04T15:04:54.6845076Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6845078Z 2025-12-04T15:04:54.6845137Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.6845181Z Traceback (most recent call last): 2025-12-04T15:04:54.6845346Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6845388Z getattr(self, test_name)() 2025-12-04T15:04:54.6845548Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6845581Z fn() 2025-12-04T15:04:54.6845731Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6845772Z method(*args, **kwargs) 2025-12-04T15:04:54.6845921Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6845961Z method(*args, **kwargs) 2025-12-04T15:04:54.6846111Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6846148Z with policy(): 2025-12-04T15:04:54.6846302Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6846344Z raise RuntimeError(msg) 2025-12-04T15:04:54.6846708Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 
2025-12-04T15:04:54.6846736Z 2025-12-04T15:04:54.6846810Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6847054Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6847056Z 2025-12-04T15:04:54.6847142Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6847145Z 2025-12-04T15:04:54.6847202Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.6847246Z Traceback (most recent call last): 2025-12-04T15:04:54.6847418Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6847460Z getattr(self, test_name)() 2025-12-04T15:04:54.6847710Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6847748Z fn() 2025-12-04T15:04:54.6847897Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6847948Z method(*args, **kwargs) 2025-12-04T15:04:54.6848098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6848137Z method(*args, **kwargs) 2025-12-04T15:04:54.6848286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6848325Z with policy(): 2025-12-04T15:04:54.6848477Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6848518Z raise RuntimeError(msg) 2025-12-04T15:04:54.6848881Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T15:04:54.6848885Z 2025-12-04T15:04:54.6848958Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6849202Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6849204Z 2025-12-04T15:04:54.6849289Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6849355Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.6849419Z ======================= 1 failed, 26 deselected in 8.08s ======================= 2025-12-04T15:04:54.6849459Z Got exit code 1 2025-12-04T15:04:54.6849499Z Retrying single test... 
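Before the retry's output begins, note the FSDP UserWarning that recurs on every rank above: `device_id` was passed as a bare "cuda" with no index, so FSDP falls back to the current device. The warning itself names the two remedies. A minimal sketch of both, where the nn.Linear module is a hypothetical stand-in for the test's transformer model and the default process group is assumed to be initialized already:

```python
# Hedged sketch of the two fixes the FSDP warning suggests; the Linear
# module is a hypothetical stand-in, and dist.init_process_group() is
# assumed to have run already (e.g. as in the worker sketch earlier).
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(rank: int) -> FSDP:
    torch.cuda.set_device(rank)          # remedy 1: make the current device explicit
    device = torch.device("cuda", rank)  # remedy 2: an indexed device, not bare "cuda"
    model = nn.Linear(16, 16).to(device)
    return FSDP(model, device_id=device)  # explicit index, so the warning is not raised
```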
2025-12-04T15:04:54.6849691Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-edb3d5fdada93d98.xml 2025-12-04T15:04:54.6849751Z ============================= test session starts ============================== 2025-12-04T15:04:54.6849867Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.6849907Z cachedir: .pytest_cache 2025-12-04T15:04:54.6850066Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.6850113Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.6850154Z configfile: pytest.ini 2025-12-04T15:04:54.6850360Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.6850436Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.6850691Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6850751Z Running 1 items in this shard 2025-12-04T15:04:54.6850754Z 2025-12-04T15:04:54.6851074Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_True_cuda I1204 14:39:32.166000 416872 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 416941 2025-12-04T15:04:54.6851227Z I1204 14:39:32.167000 416872 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 416942 2025-12-04T15:04:54.6851393Z I1204 14:39:32.167000 416872 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 416943 2025-12-04T15:04:54.6851541Z I1204 14:39:32.168000 416872 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 416944 2025-12-04T15:04:54.6851912Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6851962Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6852251Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.6852315Z {} 2025-12-04T15:04:54.6852420Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.6852493Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.6852983Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.6853045Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6853396Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6853445Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6853792Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6853840Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6854124Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.6854187Z {} 2025-12-04T15:04:54.6854289Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.6854361Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.6854710Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.6854769Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.6855270Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6855330Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6855634Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.6855694Z {} 2025-12-04T15:04:54.6855797Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.6855868Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.6856360Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.6856419Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6856702Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.6856762Z {} 2025-12-04T15:04:54.6856861Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.6856933Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.6857420Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.6857478Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.6857623Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6857786Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6858075Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6858232Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6858517Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6858642Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6858921Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6859080Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6859371Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6859517Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6859791Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6859936Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6860251Z [rank0]:E1204 14:39:38.317000 416941
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6860412Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6860905Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T15:04:54.6861022Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6861216Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6861593Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6861707Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6861919Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6862083Z [rank0]:E1204 14:39:38.317000 416941 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.6862124Z dist init r=0, world=4 2025-12-04T15:04:54.6862262Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6862422Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6862712Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6862865Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6863148Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6863271Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6863571Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6863717Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6863995Z 
[rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6864155Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6864429Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6864565Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6864848Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6864997Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6865487Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T15:04:54.6865603Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6865799Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6866172Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6866287Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6866498Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6866664Z [rank3]:E1204 14:39:38.336000 416944 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.6866703Z dist init r=3, world=4 2025-12-04T15:04:54.6866840Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6866997Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6867285Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6867438Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, 
test_name)() 2025-12-04T15:04:54.6867731Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6867866Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6868142Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6868300Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6868575Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6868724Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6869009Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6869146Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6869422Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6869568Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6870058Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
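Note on the transformer.py:144 UserWarning repeated above: the nested-tensor fast path is skipped because the encoder layer's self-attention was built with batch_first=False. A minimal sketch of the construction the warning asks for; the dimensions and layer count are illustrative, not taken from this test:

import torch
import torch.nn as nn

# batch_first=True on the layer lets TransformerEncoder honor
# enable_nested_tensor=True instead of warning and falling back.
layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2, enable_nested_tensor=True)

x = torch.randn(8, 16, 32)  # (batch, seq, feature) layout under batch_first=True
out = encoder(x)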
2025-12-04T15:04:54.6870210Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6870404Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6870775Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6870890Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6871104Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6871265Z [rank1]:E1204 14:39:38.360000 416942 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.6871305Z dist init r=1, world=4 2025-12-04T15:04:54.6871440Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6871598Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6871883Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6872069Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6872350Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6872474Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6872760Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6872908Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6873198Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6873345Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6873620Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6873756Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.6874032Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6874180Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6874667Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T15:04:54.6874781Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6874973Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6875346Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6875457Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6875669Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6875831Z [rank2]:E1204 14:39:38.375000 416943 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.6875870Z dist init r=2, world=4 2025-12-04T15:04:54.6876217Z [rank0]:[W1204 14:39:38.101707770 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.6876268Z FAILED [7.9174s] [100%] 2025-12-04T15:04:54.6876270Z 2025-12-04T15:04:54.6876329Z =================================== FAILURES =================================== 2025-12-04T15:04:54.6876440Z _ TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda _ 2025-12-04T15:04:54.6876486Z Traceback (most recent call last): 2025-12-04T15:04:54.6876648Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.6876702Z self._join_processes(fn) 2025-12-04T15:04:54.6876873Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.6876928Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.6877105Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.6877159Z raise RuntimeError(error) 2025-12-04T15:04:54.6877238Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.6877283Z Traceback (most recent call last): 2025-12-04T15:04:54.6877442Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6877485Z getattr(self, test_name)() 2025-12-04T15:04:54.6877642Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6877677Z fn() 2025-12-04T15:04:54.6877828Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6877869Z method(*args, **kwargs) 2025-12-04T15:04:54.6878020Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6878061Z method(*args, **kwargs) 2025-12-04T15:04:54.6878210Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6878248Z with policy(): 2025-12-04T15:04:54.6878399Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6878440Z raise RuntimeError(msg) 2025-12-04T15:04:54.6878806Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 
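Note on the _init_utils.py:571 UserWarning seen on every rank above: FSDP received device_id "cuda" with no index. A sketch of the two remedies the warning itself names, assuming the process group is already initialized; the Linear module and the use of the rank as the device index are illustrative:

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

rank = dist.get_rank()

# Remedy 1: pin the current device first, so a bare "cuda" resolves to it.
torch.cuda.set_device(rank)

# Remedy 2: hand FSDP an explicitly indexed device instead of plain "cuda".
model = FSDP(torch.nn.Linear(8, 8), device_id=torch.device("cuda", rank))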
2025-12-04T15:04:54.6878810Z 2025-12-04T15:04:54.6878885Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6879131Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6879135Z 2025-12-04T15:04:54.6879222Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6879224Z 2025-12-04T15:04:54.6879226Z 2025-12-04T15:04:54.6879303Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.6879390Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.6879629Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-edb3d5fdada93d98.xml - 2025-12-04T15:04:54.6879689Z =========================== short test summary info ============================ 2025-12-04T15:04:54.6879961Z FAILED [7.9174s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.6880018Z Traceback (most recent call last): 2025-12-04T15:04:54.6880203Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6880244Z getattr(self, test_name)() 2025-12-04T15:04:54.6880403Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6880438Z fn() 2025-12-04T15:04:54.6880606Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6880647Z method(*args, **kwargs) 2025-12-04T15:04:54.6880797Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6880836Z method(*args, **kwargs) 2025-12-04T15:04:54.6880998Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6881034Z with policy(): 2025-12-04T15:04:54.6881186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6881225Z raise RuntimeError(msg) 2025-12-04T15:04:54.6881589Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T15:04:54.6881591Z 2025-12-04T15:04:54.6881665Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6881909Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6881912Z 2025-12-04T15:04:54.6881999Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6882062Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
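Note on the leak report: with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 (as in the repro command above) the harness snapshots caching-allocator and driver memory per device before the test and re-checks after it; the 512 -> 13312 delta is that comparison failing. A rough stand-alone illustration of the same before/after check; this mirrors the idea only, not the internal CudaMemoryLeakCheck code, and the deliberately leaky test body is hypothetical:

import torch

def leaky_test_body():
    # Hypothetical test body that keeps a CUDA allocation alive past the test.
    global _kept
    _kept = torch.ones(1024, device="cuda")

torch.cuda.synchronize()
before = torch.cuda.memory_allocated()  # caching-allocator bytes in use

leaky_test_body()

torch.cuda.synchronize()
torch.cuda.empty_cache()  # release cached-but-free blocks back to the driver
after = torch.cuda.memory_allocated()
if after > before:
    raise RuntimeError(f"leak suspected: allocator went {before} -> {after} bytes")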
2025-12-04T15:04:54.6882125Z ======================= 1 failed, 26 deselected in 8.06s ======================= 2025-12-04T15:04:54.6882161Z Got exit code 1 2025-12-04T15:04:54.6882357Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_True_cuda 2025-12-04T15:04:54.6882486Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.6882674Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-26794ca81cb6aa09.xml 2025-12-04T15:04:54.6882733Z ============================= test session starts ============================== 2025-12-04T15:04:54.6882848Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.6882888Z cachedir: .pytest_cache 2025-12-04T15:04:54.6883047Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.6883092Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.6883133Z configfile: pytest.ini 2025-12-04T15:04:54.6883295Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.6883371Z collecting ... collected 60 items / 3 deselected / 57 selected 2025-12-04T15:04:54.6883422Z stepcurrent: skipping 3 already run items. 2025-12-04T15:04:54.6883480Z Running 24 items in this shard 2025-12-04T15:04:54.6883482Z 2025-12-04T15:04:54.6883787Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda I1204 14:39:42.904000 417258 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 417327 2025-12-04T15:04:54.6883954Z I1204 14:39:42.904000 417258 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 417328 2025-12-04T15:04:54.6884105Z I1204 14:39:42.905000 417258 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 417329 2025-12-04T15:04:54.6884254Z I1204 14:39:42.906000 417258 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 417330 2025-12-04T15:04:54.6884844Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.6884892Z _warn_cpu_init() 2025-12-04T15:04:54.6885458Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.6885496Z _warn_cpu_init() 2025-12-04T15:04:54.6886058Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.6886097Z _warn_cpu_init() 2025-12-04T15:04:54.6886659Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.6886696Z _warn_cpu_init() 2025-12-04T15:04:54.6886984Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.6887028Z return func(*args, **kwargs) 2025-12-04T15:04:54.6887172Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6887334Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6887621Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6887776Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6888074Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6888210Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6888485Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6888631Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6888916Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6889063Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6889349Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6889491Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6889768Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6889915Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6890426Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T15:04:54.6890544Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6890739Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6891095Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T15:04:54.6891209Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6891421Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6891585Z [rank2]:E1204 14:40:14.228000 417329 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.6891723Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6891883Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6892170Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6892348Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6892632Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6892754Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6893040Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6893185Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6893460Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6893618Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6893892Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6894028Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6894305Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6894453Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6894926Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T15:04:54.6895041Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6895234Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6895588Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T15:04:54.6895702Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6895914Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6896077Z [rank3]:E1204 14:40:14.228000 417330 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.6896116Z dist init r=2, world=4 2025-12-04T15:04:54.6896155Z dist init r=3, world=4 2025-12-04T15:04:54.6896294Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6896463Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6896759Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6896913Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6897202Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6897326Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6897601Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6897759Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6898031Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6898178Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6898451Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6898588Z [rank1]:E1204 14:40:14.241000 417328 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6898869Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6899016Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6899485Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T15:04:54.6899599Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6899794Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6900144Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T15:04:54.6900288Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6900501Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6900663Z [rank1]:E1204 14:40:14.241000 417328 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.6900730Z dist init r=1, world=4 2025-12-04T15:04:54.6900867Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6901027Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6901314Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6901478Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6901759Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6901883Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6902174Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6902322Z [rank0]:E1204 14:40:14.272000 417327 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6902601Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6902747Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6903024Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6903159Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6903439Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6903589Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6904065Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T15:04:54.6904182Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6904378Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6904737Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T15:04:54.6904849Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6905071Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6905242Z [rank0]:E1204 14:40:14.272000 417327 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.6905281Z dist init r=0, world=4 2025-12-04T15:04:54.6905614Z [rank0]:[W1204 14:40:14.083554647 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.6905657Z FAILED [33.1545s] [ 4%] 2025-12-04T15:04:54.6905669Z 2025-12-04T15:04:54.6905730Z =================================== FAILURES =================================== 2025-12-04T15:04:54.6905828Z ____ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda _____ 2025-12-04T15:04:54.6905876Z Traceback (most recent call last): 2025-12-04T15:04:54.6906038Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.6906092Z self._join_processes(fn) 2025-12-04T15:04:54.6906264Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.6906318Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.6906494Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.6906537Z raise RuntimeError(error) 2025-12-04T15:04:54.6906616Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.6906660Z Traceback (most recent call last): 2025-12-04T15:04:54.6906820Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6906863Z getattr(self, test_name)() 2025-12-04T15:04:54.6907019Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6907054Z fn() 2025-12-04T15:04:54.6907203Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6907243Z method(*args, **kwargs) 2025-12-04T15:04:54.6907391Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6907432Z method(*args, **kwargs) 2025-12-04T15:04:54.6907581Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6907618Z with policy(): 2025-12-04T15:04:54.6907769Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6907812Z raise RuntimeError(msg) 2025-12-04T15:04:54.6908161Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T15:04:54.6908163Z 2025-12-04T15:04:54.6908240Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6908467Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T15:04:54.6908471Z 2025-12-04T15:04:54.6908559Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6908561Z 2025-12-04T15:04:54.6908572Z 2025-12-04T15:04:54.6908649Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.6908746Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.6908977Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-26794ca81cb6aa09.xml - 2025-12-04T15:04:54.6909038Z =========================== short test summary info ============================ 2025-12-04T15:04:54.6909284Z FAILED [33.1545s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.6909329Z Traceback (most recent call last): 2025-12-04T15:04:54.6909502Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6909544Z getattr(self, test_name)() 2025-12-04T15:04:54.6909703Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6909738Z fn() 2025-12-04T15:04:54.6909899Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6909939Z method(*args, **kwargs) 2025-12-04T15:04:54.6910089Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6910130Z method(*args, **kwargs) 2025-12-04T15:04:54.6910314Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6910351Z with policy(): 2025-12-04T15:04:54.6910504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6910544Z raise RuntimeError(msg) 2025-12-04T15:04:54.6910891Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T15:04:54.6910896Z 2025-12-04T15:04:54.6910972Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6911197Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T15:04:54.6911199Z 2025-12-04T15:04:54.6911286Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6911349Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
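Note on the ProcessGroupNCCL.cpp:1553 warning that closes each failed run above: the test processes exit without tearing down the default process group. The shutdown the warning asks for is a paired destroy, sketched below; the backend choice and the elided body are illustrative:

import torch.distributed as dist

dist.init_process_group(backend="nccl")
try:
    ...  # test/training body goes here
finally:
    # Pairing init with destroy avoids "destroy_process_group() was not
    # called before program exit" at shutdown.
    dist.destroy_process_group()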
2025-12-04T15:04:54.6911412Z ======================= 1 failed, 3 deselected in 33.32s ======================= 2025-12-04T15:04:54.6911451Z Got exit code 1 2025-12-04T15:04:54.6911491Z Retrying single test... 2025-12-04T15:04:54.6911678Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-249cd4a50007f599.xml 2025-12-04T15:04:54.6911738Z ============================= test session starts ============================== 2025-12-04T15:04:54.6911850Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.6911891Z cachedir: .pytest_cache 2025-12-04T15:04:54.6912048Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.6912093Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.6912132Z configfile: pytest.ini 2025-12-04T15:04:54.6912296Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.6912387Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.6912608Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda 2025-12-04T15:04:54.6912665Z Running 1 items in this shard 2025-12-04T15:04:54.6912668Z 2025-12-04T15:04:54.6912968Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda I1204 14:40:18.822000 417660 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 417729 2025-12-04T15:04:54.6913121Z I1204 14:40:18.823000 417660 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 417730 2025-12-04T15:04:54.6913283Z I1204 14:40:18.824000 417660 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 417731 2025-12-04T15:04:54.6913433Z I1204 14:40:18.824000 417660 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 417732 2025-12-04T15:04:54.6914024Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.6914064Z _warn_cpu_init() 2025-12-04T15:04:54.6914626Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
_warn_cpu_init()
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
_warn_cpu_init()
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
_warn_cpu_init()
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
return func(*args, **kwargs)
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224.
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
[rank1]:E1204 14:40:50.326000 417730 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
dist init r=1, world=4
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360.
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
[rank3]:E1204 14:40:50.329000 417732 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
dist init r=3, world=4
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104.
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
[rank0]:E1204 14:40:50.383000 417729 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
dist init r=0, world=4
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008.
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
[rank2]:E1204 14:40:50.388000 417731 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
dist init r=2, world=4
[rank0]:[W1204 14:40:50.165349420 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
FAILED [33.4544s] [100%]

=================================== FAILURES ===================================
____ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda _____
Traceback (most recent call last):
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
self._join_processes(fn)
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
self._check_return_codes(fn, elapsed_time)
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
raise RuntimeError(error)
RuntimeError: Process 1 exited with error code 10 and exception:
Traceback (most recent call last):
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
getattr(self, test_name)()
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
fn()
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
method(*args, **kwargs)
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
method(*args, **kwargs)
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
with policy():
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0


----------------------------- Captured stdout call -----------------------------
Process 1 terminated with exit code 10, terminating remaining processes.
- generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-249cd4a50007f599.xml -
=========================== short test summary info ============================
FAILED [33.4544s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception:
Traceback (most recent call last):
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
getattr(self, test_name)()
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
fn()
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
method(*args, **kwargs)
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
method(*args, **kwargs)
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
with policy():
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
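The UserWarning that every rank prints during startup above is actionable on its own: the test constructs its module on CPU and wraps it without `device_id`, so FSDP's sharding initialization runs on CPU. A minimal sketch of the fix the warning itself recommends, assuming a torchrun launch on 4 GPUs and a placeholder nn.Linear module rather than the model this test actually uses:

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Launch with: torchrun --nproc-per-node=4 fsdp_device_id_sketch.py
dist.init_process_group("nccl")  # maps to RCCL on ROCm builds like this runner
torch.cuda.set_device(dist.get_rank())

module = nn.Linear(8, 8)  # constructed on CPU, like the module in this test
# Passing device_id lets FSDP move the module to the local GPU before running
# sharding initialization, which silences _warn_cpu_init() and satisfies the
# GPU requirement the warning mentions for sync_module_states=True.
model = FSDP(module, device_id=torch.cuda.current_device(), sync_module_states=True)

dist.destroy_process_group()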
====================== 1 failed, 26 deselected in 33.61s =======================
Got exit code 1
Retrying single test...
Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-36c35f0611b36887.xml
============================= test session starts ==============================
platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
cachedir: .pytest_cache
hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
rootdir: /var/lib/jenkins/pytorch
configfile: pytest.ini
plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
collecting ... collected 60 items / 26 deselected / 34 selected
stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda
Running 1 items in this shard

distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda I1204 14:40:54.993000 418062 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 418131
I1204 14:40:54.993000 418062 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 418132
I1204 14:40:54.994000 418062 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 418133
I1204 14:40:54.994000 418062 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 418134
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
_warn_cpu_init()
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
_warn_cpu_init()
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
_warn_cpu_init()
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
_warn_cpu_init()
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
return func(*args, **kwargs)
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224.
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
[rank1]:E1204 14:41:26.330000 418132 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
dist init r=1, world=4
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008.
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
[rank2]:E1204 14:41:26.343000 418133 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
dist init r=2, world=4
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360.
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
[rank3]:E1204 14:41:26.344000 418134 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
dist init r=3, world=4
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104.
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935]
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
[rank0]:E1204 14:41:26.386000 418131 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
dist init r=0, world=4
[rank0]:[W1204 14:41:26.224481085 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
FAILED [33.1532s] [100%]

=================================== FAILURES ===================================
____ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda _____
Traceback (most recent call last):
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
self._join_processes(fn)
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
self._check_return_codes(fn, elapsed_time)
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
raise RuntimeError(error)
RuntimeError: Process 1 exited with error code 10 and exception:
Traceback (most recent call last):
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
getattr(self, test_name)()
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
fn()
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
method(*args, **kwargs)
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
method(*args, **kwargs)
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
with policy():
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0


----------------------------- Captured stdout call -----------------------------
Process 1 terminated with exit code 10, terminating remaining processes.
- generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-36c35f0611b36887.xml -
=========================== short test summary info ============================
FAILED [33.1532s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception:
Traceback (most recent call last):
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
getattr(self, test_name)()
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
fn()
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
method(*args, **kwargs)
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
method(*args, **kwargs)
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
with policy():
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224.

To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
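Each run above also ends with the ProcessGroupNCCL warning that destroy_process_group() was never called before the worker processes exited. A minimal sketch of the teardown that warning asks for; the body of run_rank is a hypothetical stand-in for the real per-rank harness in common_distributed.py:

import torch
import torch.distributed as dist

def run_rank() -> None:
    # Launch with: torchrun --nproc-per-node=4 teardown_sketch.py
    dist.init_process_group("nccl")
    torch.cuda.set_device(dist.get_rank())
    try:
        dist.barrier()  # stand-in for the actual per-rank test body
    finally:
        # Explicit teardown releases the NCCL/RCCL communicator resources and
        # avoids the "destroy_process_group() was not called" warning above.
        dist.destroy_process_group()

if __name__ == "__main__":
    run_rank()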
====================== 1 failed, 26 deselected in 33.31s =======================
Got exit code 1
FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda
Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ce66f313041855bd.xml
============================= test session starts ==============================
platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
cachedir: .pytest_cache
hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
rootdir: /var/lib/jenkins/pytorch
configfile: pytest.ini
plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
collecting ... collected 60 items / 4 deselected / 56 selected
stepcurrent: skipping 4 already run items.
Running 23 items in this shard

distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda I1204 14:41:30.769000 418464 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 418533
I1204 14:41:30.770000 418464 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 418534
I1204 14:41:30.770000 418464 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 418535
I1204 14:41:30.771000 418464 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 418536
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
_warn_cpu_init()
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.6973076Z _warn_cpu_init() 2025-12-04T15:04:54.6973636Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.6973675Z _warn_cpu_init() 2025-12-04T15:04:54.6974234Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.6974270Z _warn_cpu_init() 2025-12-04T15:04:54.6974559Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.6974625Z return func(*args, **kwargs) 2025-12-04T15:04:54.6974770Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6974932Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6975225Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6975387Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6975670Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6975795Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6976079Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6976228Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6976502Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6976648Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6976923Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6977062Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6977341Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6977488Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6977975Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T15:04:54.6978090Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6978285Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6978652Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.6978766Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6978995Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6979157Z [rank1]:E1204 14:42:01.992000 418534 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.6979196Z dist init r=1, world=4 2025-12-04T15:04:54.6979332Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6979491Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6979798Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6979954Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6980293Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6980418Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6980694Z [rank3]:E1204 14:42:01.994000 418536 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6980840Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6981116Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6981261Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6981534Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6981669Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6981949Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6982097Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6982581Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
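The RuntimeError above is PYTORCH_TEST_CUDA_MEM_LEAK_CHECK firing: the harness snapshots both the caching allocator and the driver's view of device memory before the test and compares afterwards. A rough sketch of that bookkeeping, assuming a hypothetical run_test() body; the real check lives in torch/testing/_internal/common_utils.py and is more careful about caches and retries before declaring a leak:

    # Rough model of the before/after accounting; not the actual harness code.
    import torch

    def snapshot(device: int) -> tuple[int, int]:
        # (caching-allocator bytes, driver-level bytes in use) for one device
        free, total = torch.cuda.mem_get_info(device)
        return torch.cuda.memory_allocated(device), total - free

    def run_test() -> None:
        pass  # hypothetical stand-in; a real leak would keep a live allocation

    before = [snapshot(d) for d in range(torch.cuda.device_count())]
    run_test()
    for d, (alloc0, driver0) in enumerate(before):
        alloc1, driver1 = snapshot(d)
        if alloc1 > alloc0 and driver1 > driver0:
            raise RuntimeError(
                f"possible leak on device {d}: caching allocator went "
                f"{alloc0} -> {alloc1}, driver {driver0} -> {driver1}")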
2025-12-04T15:04:54.6982696Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6982890Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6983256Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.6983395Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6983606Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6983767Z [rank3]:E1204 14:42:01.994000 418536 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.6983807Z dist init r=3, world=4 2025-12-04T15:04:54.6983955Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6984115Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6984413Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6984567Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6984851Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6984973Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6985250Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6985399Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6985675Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6985821Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6986097Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6986232Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.6986508Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6986657Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6987139Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T15:04:54.6987265Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6987469Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6987835Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.6987949Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6988168Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6988331Z [rank0]:E1204 14:42:02.030000 418533 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.6988370Z dist init r=0, world=4 2025-12-04T15:04:54.6988507Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.6988674Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.6988960Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6989113Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.6989398Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6989521Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.6989796Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6989943Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T15:04:54.6990253Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6990401Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.6990675Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6990811Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.6991087Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6991235Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.6991716Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T15:04:54.6991853Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6992046Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6992419Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.6992533Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.6992744Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6992922Z [rank2]:E1204 14:42:02.054000 418535 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.6992960Z dist init r=2, world=4 2025-12-04T15:04:54.6993293Z [rank0]:[W1204 14:42:02.772859744 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.6993335Z FAILED [33.1529s] [ 4%] 2025-12-04T15:04:54.6993338Z 2025-12-04T15:04:54.6993393Z =================================== FAILURES =================================== 2025-12-04T15:04:54.6993498Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda _ 2025-12-04T15:04:54.6993546Z Traceback (most recent call last): 2025-12-04T15:04:54.6993708Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.6993752Z self._join_processes(fn) 2025-12-04T15:04:54.6993926Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.6993979Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.6994157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.6994202Z raise RuntimeError(error) 2025-12-04T15:04:54.6994282Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.6994326Z Traceback (most recent call last): 2025-12-04T15:04:54.6994487Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6994531Z getattr(self, test_name)() 2025-12-04T15:04:54.6994692Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6994727Z fn() 2025-12-04T15:04:54.6994879Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6994921Z method(*args, **kwargs) 2025-12-04T15:04:54.6995069Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6995110Z method(*args, **kwargs) 2025-12-04T15:04:54.6995260Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6995298Z with policy(): 2025-12-04T15:04:54.6995460Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6995517Z raise RuntimeError(msg) 2025-12-04T15:04:54.6995876Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
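The ProcessGroupNCCL shutdown warning at the top of this block is about teardown rather than the leak itself: the test exits without calling destroy_process_group(). A generic pattern that satisfies it, sketched here rather than taken from the harness:

    # Minimal shutdown pattern the ProcessGroupNCCL warning asks for;
    # a generic sketch, not the test harness's code.
    import torch.distributed as dist

    dist.init_process_group("nccl")
    try:
        ...  # collectives / test body
    finally:
        if dist.is_initialized():
            dist.destroy_process_group()  # avoids the exit-time leak warning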
2025-12-04T15:04:54.6995879Z 2025-12-04T15:04:54.6995954Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6996192Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.6996205Z 2025-12-04T15:04:54.6996292Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6996294Z 2025-12-04T15:04:54.6996296Z 2025-12-04T15:04:54.6996372Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.6996461Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.6996699Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ce66f313041855bd.xml - 2025-12-04T15:04:54.6996760Z =========================== short test summary info ============================ 2025-12-04T15:04:54.6997012Z FAILED [33.1529s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.6997058Z Traceback (most recent call last): 2025-12-04T15:04:54.6997222Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.6997266Z getattr(self, test_name)() 2025-12-04T15:04:54.6997425Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.6997463Z fn() 2025-12-04T15:04:54.6997614Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6997654Z method(*args, **kwargs) 2025-12-04T15:04:54.6997804Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.6997846Z method(*args, **kwargs) 2025-12-04T15:04:54.6997994Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.6998032Z with policy(): 2025-12-04T15:04:54.6998184Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.6998226Z raise RuntimeError(msg) 2025-12-04T15:04:54.6998586Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T15:04:54.6998589Z 2025-12-04T15:04:54.6998663Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.6998901Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.6998903Z 2025-12-04T15:04:54.6998988Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.6999053Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
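The four driver-memory deltas reported above are worth a quick check: they are identical on every device, which suggests one uniform per-rank allocation surviving the test rather than rank-dependent state. The arithmetic, using the exact numbers from the log:

    # Deltas taken verbatim from the failure messages above (devices 0-3).
    before = [2453667840, 2317352960, 2300575744, 2250244096]
    after  = [3988783104, 3852468224, 3835691008, 3785359360]
    deltas = [a - b for a, b in zip(after, before)]
    print(deltas)              # [1535115264, 1535115264, 1535115264, 1535115264]
    print(deltas[0] / 2**30)   # 1.4296875, i.e. ~1.43 GiB per device
    print(49664 - 512)         # caching-allocator delta: 49152 B = 48 KiB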
2025-12-04T15:04:54.6999114Z ======================= 1 failed, 4 deselected in 33.31s =======================
2025-12-04T15:04:54.6999165Z Got exit code 1
2025-12-04T15:04:54.6999205Z Retrying single test...
2025-12-04T15:04:54.6999404Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-894c1b9d575c996c.xml
2025-12-04T15:04:54.6999461Z ============================= test session starts ==============================
2025-12-04T15:04:54.6999573Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.6999613Z cachedir: .pytest_cache
2025-12-04T15:04:54.6999769Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.6999814Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.6999864Z configfile: pytest.ini
2025-12-04T15:04:54.7000025Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.7000103Z collecting ... collected 60 items / 26 deselected / 34 selected
2025-12-04T15:04:54.7000369Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda
2025-12-04T15:04:54.7000433Z Running 1 items in this shard
2025-12-04T15:04:54.7000435Z
2025-12-04T15:04:54.7000749Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda I1204 14:42:06.543000 418866 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 418935
2025-12-04T15:04:54.7000902Z I1204 14:42:06.544000 418866 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 418936
2025-12-04T15:04:54.7001053Z I1204 14:42:06.545000 418866 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 418937
2025-12-04T15:04:54.7001200Z I1204 14:42:06.545000 418866 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 418938
2025-12-04T15:04:54.7001776Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7001813Z _warn_cpu_init()
2025-12-04T15:04:54.7002374Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7002412Z _warn_cpu_init() 2025-12-04T15:04:54.7002974Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7003012Z _warn_cpu_init() 2025-12-04T15:04:54.7003571Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7003633Z _warn_cpu_init() 2025-12-04T15:04:54.7003921Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7003964Z return func(*args, **kwargs) 2025-12-04T15:04:54.7004107Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7004281Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7004567Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7004722Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7005013Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7005137Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7005414Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7005561Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7005839Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7005986Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7006258Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7006395Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7006669Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7006818Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7007300Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T15:04:54.7007416Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7007613Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7007986Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7008109Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7008318Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7008490Z [rank2]:E1204 14:42:37.803000 418937 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7008529Z dist init r=2, world=4 2025-12-04T15:04:54.7008669Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7008828Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7009123Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7009276Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7009560Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7009686Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7009961Z [rank1]:E1204 14:42:37.806000 418936 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7010110Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7010414Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7010559Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7010832Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7010970Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7011247Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7011395Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7011876Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
2025-12-04T15:04:54.7012009Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7012221Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7012587Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7012701Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7012922Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7013084Z [rank1]:E1204 14:42:37.806000 418936 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7013126Z dist init r=1, world=4 2025-12-04T15:04:54.7013276Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7013436Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7013721Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7013877Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7014159Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7014287Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7014563Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7014712Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7014985Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7015130Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7015407Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7015543Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7015820Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7015968Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7016448Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T15:04:54.7016582Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7016777Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7017150Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7017262Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7017484Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7017647Z [rank0]:E1204 14:42:37.813000 418935 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7017688Z dist init r=0, world=4 2025-12-04T15:04:54.7017824Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7017985Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7018270Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7018425Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7018710Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7018833Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7019110Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7019257Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T15:04:54.7019533Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7019680Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7019953Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7020090Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7020402Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7020575Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7021054Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T15:04:54.7021170Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7021377Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7021757Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7021872Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7022081Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7022243Z [rank3]:E1204 14:42:37.855000 418938 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7022281Z dist init r=3, world=4 2025-12-04T15:04:54.7022616Z [rank0]:[W1204 14:42:38.584154763 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7022656Z FAILED [33.2563s] [100%] 2025-12-04T15:04:54.7022658Z 2025-12-04T15:04:54.7022715Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7022821Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda _ 2025-12-04T15:04:54.7022868Z Traceback (most recent call last): 2025-12-04T15:04:54.7023029Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7023075Z self._join_processes(fn) 2025-12-04T15:04:54.7023248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7023302Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7023478Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7023523Z raise RuntimeError(error) 2025-12-04T15:04:54.7023602Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7023648Z Traceback (most recent call last): 2025-12-04T15:04:54.7023808Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7023853Z getattr(self, test_name)() 2025-12-04T15:04:54.7024010Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7024046Z fn() 2025-12-04T15:04:54.7024197Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7024239Z method(*args, **kwargs) 2025-12-04T15:04:54.7024388Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7024447Z method(*args, **kwargs) 2025-12-04T15:04:54.7024595Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7024634Z with policy(): 2025-12-04T15:04:54.7024784Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7024824Z raise RuntimeError(msg) 2025-12-04T15:04:54.7025190Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
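The repro command printed repeatedly above must be run from the base repo dir. Where it needs to be scripted rather than pasted into a shell, a hypothetical wrapper looks like this:

    # Hypothetical wrapper around the repro command from the log; pasting
    # the command into a shell from the base repo dir works just as well.
    import os
    import subprocess
    import sys

    env = dict(
        os.environ,
        PYTORCH_TEST_WITH_ROCM="1",
        PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
    )
    subprocess.run(
        [
            sys.executable,
            "test/distributed/fsdp/test_fsdp_core.py",
            "TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda",
        ],
        env=env,
        check=True,  # raises CalledProcessError on the nonzero exit seen here
    )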
2025-12-04T15:04:54.7025192Z 2025-12-04T15:04:54.7025266Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7025507Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7025510Z 2025-12-04T15:04:54.7026644Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7026647Z 2025-12-04T15:04:54.7026649Z 2025-12-04T15:04:54.7026726Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7026812Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7027045Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-894c1b9d575c996c.xml - 2025-12-04T15:04:54.7027106Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7027362Z FAILED [33.2563s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7027411Z Traceback (most recent call last): 2025-12-04T15:04:54.7027575Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7027618Z getattr(self, test_name)() 2025-12-04T15:04:54.7027777Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7027812Z fn() 2025-12-04T15:04:54.7027960Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7028003Z method(*args, **kwargs) 2025-12-04T15:04:54.7028153Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7028194Z method(*args, **kwargs) 2025-12-04T15:04:54.7028342Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7028380Z with policy(): 2025-12-04T15:04:54.7028531Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7028573Z raise RuntimeError(msg) 2025-12-04T15:04:54.7028929Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T15:04:54.7028931Z 2025-12-04T15:04:54.7029005Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7029245Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7029258Z 2025-12-04T15:04:54.7029359Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7029422Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
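Every child here reports "exiting process N with exit code: 10", and the parent turns that into the "Process N exited with error code 10 and exception:" RuntimeError quoted in the summary. A simplified sketch of that parent-side check, loosely modeled on MultiProcessTestCase and not the actual implementation; 10 appears to be the harness's generic exit code for a child that raised inside the test body:

    # Simplified model of the parent-side return-code check; the real logic
    # is in _check_return_codes() in common_distributed.py.
    def check_return_codes(procs) -> None:
        # procs: joined multiprocessing.Process children, one per rank
        for rank, proc in enumerate(procs):
            if proc.exitcode != 0:
                raise RuntimeError(
                    f"Process {rank} exited with error code {proc.exitcode} "
                    f"and exception:")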
2025-12-04T15:04:54.7029487Z ====================== 1 failed, 26 deselected in 33.42s =======================
2025-12-04T15:04:54.7029525Z Got exit code 1
2025-12-04T15:04:54.7029565Z Retrying single test...
2025-12-04T15:04:54.7029752Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e50a88a971b93d0e.xml
2025-12-04T15:04:54.7029811Z ============================= test session starts ==============================
2025-12-04T15:04:54.7029934Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.7029975Z cachedir: .pytest_cache
2025-12-04T15:04:54.7030136Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.7030225Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.7030268Z configfile: pytest.ini
2025-12-04T15:04:54.7030454Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.7030529Z collecting ... collected 60 items / 26 deselected / 34 selected
2025-12-04T15:04:54.7030759Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda
2025-12-04T15:04:54.7030802Z Running 1 items in this shard
2025-12-04T15:04:54.7030804Z
2025-12-04T15:04:54.7031117Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda I1204 14:42:42.358000 419268 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 419337
2025-12-04T15:04:54.7031271Z I1204 14:42:42.359000 419268 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 419338
2025-12-04T15:04:54.7031424Z I1204 14:42:42.359000 419268 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 419339
2025-12-04T15:04:54.7031574Z I1204 14:42:42.360000 419268 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 419340
2025-12-04T15:04:54.7032147Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7032185Z _warn_cpu_init()
2025-12-04T15:04:54.7032747Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7032785Z _warn_cpu_init() 2025-12-04T15:04:54.7033344Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7033406Z _warn_cpu_init() 2025-12-04T15:04:54.7033970Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7034007Z _warn_cpu_init() 2025-12-04T15:04:54.7034308Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7034354Z return func(*args, **kwargs) 2025-12-04T15:04:54.7034496Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7034659Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7034955Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7035109Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7035391Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7035516Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7035796Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7035942Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7036218Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7036365Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7036639Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7036775Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7037052Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7037199Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7037680Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T15:04:54.7037814Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7038007Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7038372Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7038496Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7038710Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7038874Z [rank2]:E1204 14:43:13.779000 419339 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7038915Z dist init r=2, world=4 2025-12-04T15:04:54.7039062Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7039219Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7039507Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7039658Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7039946Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7040072Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7040380Z [rank0]:E1204 14:43:13.811000 419337 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7040526Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7040803Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7040952Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7041228Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7041363Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7041639Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7041787Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7042278Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
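The UserWarning repeated above is actionable: FSDP runs its sharding initialization on CPU unless told which GPU to use, and `sync_module_states=True` needs the module on a GPU device. A minimal sketch of the fix the warning suggests, assuming a process group is already initialized and one GPU per rank (the toy `nn.Linear` model and `wrap_model` name are illustrative, not the test's actual code):

```python
# Minimal sketch (not this test's code): silence the CPU-init UserWarning
# by giving FSDP an explicit device_id. Assumes torch.distributed is
# already initialized and each rank owns one GPU.
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(rank: int) -> FSDP:
    model = nn.Linear(1024, 1024)              # constructed on CPU
    return FSDP(
        model,
        device_id=torch.device("cuda", rank),  # sharding init runs on GPU
        sync_module_states=True,               # requires GPU communication
    )
```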
2025-12-04T15:04:54.7042404Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7042598Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7042979Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7043094Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7043323Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7043486Z [rank0]:E1204 14:43:13.811000 419337 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7043525Z dist init r=0, world=4 2025-12-04T15:04:54.7043665Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7043822Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7044109Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7044264Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7044549Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7044675Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7044950Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7045100Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7045376Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7045523Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7045799Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7045936Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7046211Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7046384Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7046864Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T15:04:54.7046990Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7047185Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7047562Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7047675Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7047888Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7048052Z [rank3]:E1204 14:43:13.834000 419340 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7048091Z dist init r=3, world=4 2025-12-04T15:04:54.7048227Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7048388Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7048671Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7048822Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7049104Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7049226Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7049500Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7049649Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T15:04:54.7049924Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7050070Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7050386Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7050552Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7050830Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7050977Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7051470Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T15:04:54.7051586Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7051791Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7052154Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7052267Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7052479Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7052645Z [rank1]:E1204 14:43:13.841000 419338 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7052686Z dist init r=1, world=4 2025-12-04T15:04:54.7053019Z [rank0]:[W1204 14:43:14.603814497 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7053059Z FAILED [33.2529s] [100%] 2025-12-04T15:04:54.7053062Z 2025-12-04T15:04:54.7053117Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7053223Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda _ 2025-12-04T15:04:54.7053269Z Traceback (most recent call last): 2025-12-04T15:04:54.7053429Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7053476Z self._join_processes(fn) 2025-12-04T15:04:54.7053647Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7053703Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7053880Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7053924Z raise RuntimeError(error) 2025-12-04T15:04:54.7054003Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7054047Z Traceback (most recent call last): 2025-12-04T15:04:54.7054208Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7054253Z getattr(self, test_name)() 2025-12-04T15:04:54.7054410Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7054475Z fn() 2025-12-04T15:04:54.7054625Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7054667Z method(*args, **kwargs) 2025-12-04T15:04:54.7054816Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7054859Z method(*args, **kwargs) 2025-12-04T15:04:54.7055007Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7055044Z with policy(): 2025-12-04T15:04:54.7055204Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7055246Z raise RuntimeError(msg) 2025-12-04T15:04:54.7055602Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T15:04:54.7055605Z 2025-12-04T15:04:54.7055693Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7055930Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7055934Z 2025-12-04T15:04:54.7056022Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7056024Z 2025-12-04T15:04:54.7056026Z 2025-12-04T15:04:54.7056104Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7056192Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7056426Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e50a88a971b93d0e.xml - 2025-12-04T15:04:54.7056488Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7056743Z FAILED [33.2529s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7056788Z Traceback (most recent call last): 2025-12-04T15:04:54.7056951Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7056994Z getattr(self, test_name)() 2025-12-04T15:04:54.7057153Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7057188Z fn() 2025-12-04T15:04:54.7057339Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7057381Z method(*args, **kwargs) 2025-12-04T15:04:54.7057530Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7057569Z method(*args, **kwargs) 2025-12-04T15:04:54.7057717Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7057754Z with policy(): 2025-12-04T15:04:54.7057906Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7057946Z raise RuntimeError(msg) 2025-12-04T15:04:54.7058305Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T15:04:54.7058334Z 2025-12-04T15:04:54.7058410Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7058648Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7058650Z 2025-12-04T15:04:54.7058736Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7058800Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
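The RuntimeError above is raised by the mem-leak check that `PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1` enables: it records caching-allocator and driver-level allocations before the test body and fails if they grew afterward (here, 512 bytes grew to 49664 on device 2). A rough approximation of that idea using only public allocator counters; the in-tree check lives in `torch/testing/_internal/common_utils.py` and additionally tracks driver-allocated memory, so this is illustrative, not the real implementation:

```python
# Rough approximation of the leak check's assertion; the real check in
# torch/testing/_internal/common_utils.py also compares CUDA driver
# allocated memory. Illustrative only.
import torch

class NaiveCudaLeakCheck:
    def __init__(self, device: int = 0):
        self.device = device

    def __enter__(self):
        torch.cuda.synchronize(self.device)
        self.before = torch.cuda.memory_allocated(self.device)
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            return False                 # don't mask the test's own error
        torch.cuda.synchronize(self.device)
        torch.cuda.empty_cache()         # release cached, unused blocks
        after = torch.cuda.memory_allocated(self.device)
        if after > self.before:          # e.g. 512 -> 49664 bytes above
            raise RuntimeError(
                f"possible leak on device {self.device}: "
                f"{self.before} -> {after} bytes"
            )
        return False
```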
2025-12-04T15:04:54.7058864Z ====================== 1 failed, 26 deselected in 33.41s ======================= 2025-12-04T15:04:54.7058910Z Got exit code 1 2025-12-04T15:04:54.7059099Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7059228Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.7059428Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2e3806273f7b0cbb.xml 2025-12-04T15:04:54.7059487Z ============================= test session starts ============================== 2025-12-04T15:04:54.7059601Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7059642Z cachedir: .pytest_cache 2025-12-04T15:04:54.7059798Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7059842Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7059883Z configfile: pytest.ini 2025-12-04T15:04:54.7060042Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7060118Z collecting ... collected 60 items / 5 deselected / 55 selected 2025-12-04T15:04:54.7060212Z stepcurrent: skipping 5 already run items. 2025-12-04T15:04:54.7060259Z Running 22 items in this shard 2025-12-04T15:04:54.7060261Z 2025-12-04T15:04:54.7060560Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_none_cuda I1204 14:43:18.322000 419670 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 419739 2025-12-04T15:04:54.7060715Z I1204 14:43:18.323000 419670 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 419740 2025-12-04T15:04:54.7060866Z I1204 14:43:18.323000 419670 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 419741 2025-12-04T15:04:54.7061015Z I1204 14:43:18.324000 419670 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 419742 2025-12-04T15:04:54.7061592Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7061630Z _warn_cpu_init() 2025-12-04T15:04:54.7062196Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7062252Z _warn_cpu_init() 2025-12-04T15:04:54.7062824Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7062861Z _warn_cpu_init() 2025-12-04T15:04:54.7063430Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7063469Z _warn_cpu_init() 2025-12-04T15:04:54.7063771Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7063818Z return func(*args, **kwargs) 2025-12-04T15:04:54.7063959Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7064122Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7064411Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7064566Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7064851Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7064974Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7065253Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7065399Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7065677Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7065824Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7066100Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7066236Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7066517Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7066675Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7067157Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T15:04:54.7067277Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7076520Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7076929Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda 2025-12-04T15:04:54.7077078Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7077294Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7077464Z [rank0]:E1204 14:43:55.964000 419739 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7077507Z dist init r=0, world=4 2025-12-04T15:04:54.7077777Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7077941Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7078243Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7078400Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7078688Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7078817Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7079095Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7079248Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7079527Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7079676Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7079952Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7080091Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7080462Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7080611Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7081102Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T15:04:54.7081220Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7081420Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7081789Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda 2025-12-04T15:04:54.7081907Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7082120Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7082290Z [rank1]:E1204 14:43:55.975000 419740 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7082334Z dist init r=1, world=4 2025-12-04T15:04:54.7082473Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7082635Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7082927Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7083082Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7083366Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7083492Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7083777Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7083929Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7084208Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7084355Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7084646Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7084799Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7085083Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7085233Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7085718Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T15:04:54.7085850Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7086047Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7086402Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda 2025-12-04T15:04:54.7086515Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7086729Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7086896Z [rank3]:E1204 14:43:55.976000 419742 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7086939Z dist init r=3, world=4 2025-12-04T15:04:54.7087078Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7087240Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7087532Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7087686Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7087975Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7088099Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7088377Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7088525Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T15:04:54.7088805Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7088974Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7089251Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7089387Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7089673Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7089823Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7090333Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T15:04:54.7090453Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7090648Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7091003Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda 2025-12-04T15:04:54.7091119Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7091330Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7091495Z [rank2]:E1204 14:43:56.018000 419741 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7091534Z dist init r=2, world=4 2025-12-04T15:04:54.7091876Z [rank0]:[W1204 14:43:56.675790424 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7091920Z FAILED [39.5640s] [ 4%] 2025-12-04T15:04:54.7091923Z 2025-12-04T15:04:54.7091986Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7092090Z _____ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda _____ 2025-12-04T15:04:54.7092141Z Traceback (most recent call last): 2025-12-04T15:04:54.7092309Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7092357Z self._join_processes(fn) 2025-12-04T15:04:54.7092530Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7092588Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7092766Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7092812Z raise RuntimeError(error) 2025-12-04T15:04:54.7092914Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.7092977Z Traceback (most recent call last): 2025-12-04T15:04:54.7093140Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7093185Z getattr(self, test_name)() 2025-12-04T15:04:54.7093345Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7093382Z fn() 2025-12-04T15:04:54.7093534Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7114158Z method(*args, **kwargs) 2025-12-04T15:04:54.7114351Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7114393Z method(*args, **kwargs) 2025-12-04T15:04:54.7114544Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7114583Z with policy(): 2025-12-04T15:04:54.7114751Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7114793Z raise RuntimeError(msg) 2025-12-04T15:04:54.7115146Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T15:04:54.7115148Z 2025-12-04T15:04:54.7115225Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7115494Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda 2025-12-04T15:04:54.7115497Z 2025-12-04T15:04:54.7115584Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7115587Z 2025-12-04T15:04:54.7115589Z 2025-12-04T15:04:54.7115668Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7115754Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7115989Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2e3806273f7b0cbb.xml - 2025-12-04T15:04:54.7116051Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7116304Z FAILED [39.5640s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.7116351Z Traceback (most recent call last): 2025-12-04T15:04:54.7116516Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7116560Z getattr(self, test_name)() 2025-12-04T15:04:54.7116719Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7116755Z fn() 2025-12-04T15:04:54.7116904Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7116945Z method(*args, **kwargs) 2025-12-04T15:04:54.7117095Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7117135Z method(*args, **kwargs) 2025-12-04T15:04:54.7117286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7117323Z with policy(): 2025-12-04T15:04:54.7117495Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7117567Z raise RuntimeError(msg) 2025-12-04T15:04:54.7117922Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T15:04:54.7117925Z 2025-12-04T15:04:54.7118001Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7118239Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda 2025-12-04T15:04:54.7118242Z 2025-12-04T15:04:54.7118329Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7118394Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
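The `Got exit code 1` / `Retrying single test...` / `FAILED CONSISTENTLY` / `continue-through-error` lines come from the test harness driving pytest here: a failing test is rerun in isolation, and only a failure on the rerun is recorded as consistent, after which the shard keeps going instead of aborting. A hypothetical condensation of that control flow; `run_pytest` and its behavior are illustrative stand-ins, not the harness's real API:

```python
# Hypothetical condensation of the retry flow visible in this log:
# rerun the failing test alone, and with continue-through-error set,
# record a consistent failure instead of aborting the shard.
# run_pytest() is an illustrative stand-in, not the real harness API.
import subprocess

def run_pytest(test_id: str) -> int:
    return subprocess.call(["python", "-m", "pytest", test_id])

def run_with_retry(test_id: str, continue_through_error: bool = True) -> bool:
    if run_pytest(test_id) == 0:
        return True
    print("Got exit code 1")
    print("Retrying single test...")
    if run_pytest(test_id) == 0:
        return True                      # flaky: passed on the retry
    print(f"FAILED CONSISTENTLY: {test_id}")
    if not continue_through_error:
        raise SystemExit(1)              # abort the whole shard
    return False                         # continue with remaining tests
```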
2025-12-04T15:04:54.7118457Z ======================= 1 failed, 5 deselected in 39.72s =======================
2025-12-04T15:04:54.7118495Z Got exit code 1
2025-12-04T15:04:54.7118535Z Retrying single test...
2025-12-04T15:04:54.7118735Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a2d205e7935ab739.xml
2025-12-04T15:04:54.7118793Z ============================= test session starts ==============================
2025-12-04T15:04:54.7118908Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.7118949Z cachedir: .pytest_cache
2025-12-04T15:04:54.7119108Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.7119153Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.7119194Z configfile: pytest.ini
2025-12-04T15:04:54.7119358Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.7119433Z collecting ... collected 60 items / 26 deselected / 34 selected
2025-12-04T15:04:54.7119650Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7119695Z Running 1 items in this shard
2025-12-04T15:04:54.7119697Z
2025-12-04T15:04:54.7120009Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_none_cuda I1204 14:44:00.405000 420072 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 420141
2025-12-04T15:04:54.7120223Z I1204 14:44:00.406000 420072 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 420142
2025-12-04T15:04:54.7120376Z I1204 14:44:00.407000 420072 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 420143
2025-12-04T15:04:54.7120528Z I1204 14:44:00.407000 420072 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 420144
2025-12-04T15:04:54.7121112Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7121149Z _warn_cpu_init()
2025-12-04T15:04:54.7121715Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7121784Z _warn_cpu_init()
2025-12-04T15:04:54.7122360Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7122397Z _warn_cpu_init()
2025-12-04T15:04:54.7122967Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7123007Z _warn_cpu_init()
2025-12-04T15:04:54.7123297Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T15:04:54.7123341Z return func(*args, **kwargs)
2025-12-04T15:04:54.7123485Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.7123648Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.7123937Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7124094Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.7124380Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7124505Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.7124786Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7124934Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7125212Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7125358Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7125635Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7125770Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.7126059Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7126219Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.7126701Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552.
2025-12-04T15:04:54.7126818Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7127016Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7127397Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7127511Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7127722Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7127886Z [rank1]:E1204 14:44:38.031000 420142 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T15:04:54.7127927Z dist init r=1, world=4
2025-12-04T15:04:54.7128064Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.7128222Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.7128508Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7128660Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.7128945Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7129074Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.7129355Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7129501Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7129777Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7129924Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7130236Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7130402Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.7130677Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7130826Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.7131311Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688.
2025-12-04T15:04:54.7131429Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7131635Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7131990Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7132104Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7132314Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7132481Z [rank3]:E1204 14:44:38.035000 420144 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T15:04:54.7132520Z dist init r=3, world=4
2025-12-04T15:04:54.7132658Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.7132815Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.7133103Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7133256Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.7133541Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7133667Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.7133943Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7134091Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7134365Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7134532Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7134807Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7134943Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.7135236Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7135386Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.7135874Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336.
2025-12-04T15:04:54.7135989Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7136185Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7136540Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7136655Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7136864Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7137028Z [rank2]:E1204 14:44:38.070000 420143 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T15:04:54.7137068Z dist init r=2, world=4
2025-12-04T15:04:54.7137204Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.7137365Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.7137650Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7137806Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.7138089Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7138214Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.7138489Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7138650Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7138938Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7139085Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7139363Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7139508Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.7139786Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7139943Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.7140465Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432.
2025-12-04T15:04:54.7140581Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7140776Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7141134Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7141245Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7141457Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7141622Z [rank0]:E1204 14:44:38.086000 420141 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T15:04:54.7141662Z dist init r=0, world=4
2025-12-04T15:04:54.7142002Z [rank0]:[W1204 14:44:38.868888876 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T15:04:54.7142047Z FAILED [39.4611s] [100%]
2025-12-04T15:04:54.7142049Z
2025-12-04T15:04:54.7142108Z =================================== FAILURES ===================================
2025-12-04T15:04:54.7142209Z _____ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda _____
2025-12-04T15:04:54.7142255Z Traceback (most recent call last):
2025-12-04T15:04:54.7142420Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.7142464Z self._join_processes(fn)
2025-12-04T15:04:54.7142639Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.7142710Z self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.7142887Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.7142949Z raise RuntimeError(error)
2025-12-04T15:04:54.7143031Z RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T15:04:54.7143077Z Traceback (most recent call last):
2025-12-04T15:04:54.7143238Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7143281Z getattr(self, test_name)()
2025-12-04T15:04:54.7143439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7143476Z fn()
2025-12-04T15:04:54.7143650Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7143694Z method(*args, **kwargs)
2025-12-04T15:04:54.7143843Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7143886Z method(*args, **kwargs)
2025-12-04T15:04:54.7144048Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7144088Z with policy():
2025-12-04T15:04:54.7144238Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7144280Z raise RuntimeError(msg)
2025-12-04T15:04:54.7144626Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688.
2025-12-04T15:04:54.7144628Z
2025-12-04T15:04:54.7144706Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7144930Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7144933Z
2025-12-04T15:04:54.7145024Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7145026Z
2025-12-04T15:04:54.7145027Z
2025-12-04T15:04:54.7145105Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T15:04:54.7145193Z Process 3 terminated with exit code 10, terminating remaining processes.
2025-12-04T15:04:54.7145428Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a2d205e7935ab739.xml -
2025-12-04T15:04:54.7145488Z =========================== short test summary info ============================
2025-12-04T15:04:54.7145735Z FAILED [39.4611s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T15:04:54.7145783Z Traceback (most recent call last):
2025-12-04T15:04:54.7145948Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7145990Z getattr(self, test_name)()
2025-12-04T15:04:54.7146149Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7146185Z fn()
2025-12-04T15:04:54.7146335Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7146376Z method(*args, **kwargs)
2025-12-04T15:04:54.7146526Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7146579Z method(*args, **kwargs)
2025-12-04T15:04:54.7146730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7146779Z with policy():
2025-12-04T15:04:54.7146932Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7146973Z raise RuntimeError(msg)
2025-12-04T15:04:54.7147318Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688.
2025-12-04T15:04:54.7147320Z
2025-12-04T15:04:54.7147404Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7147631Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7147635Z
2025-12-04T15:04:54.7147724Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7147797Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
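Note on the failure mode above: the leak checker enabled by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 snapshots the CUDA caching-allocator counter and the driver-level allocation (total minus free) for each device before the test body and compares them afterwards; the pair of numbers quoted in each RuntimeError (512 -> 12800 bytes from the caching allocator, roughly 2.3 GB -> 3.8 GB from the driver) are those two snapshots. Below is a minimal sketch of the same before/after comparison using public torch.cuda APIs; the function name check_cuda_leak and the zero-byte tolerance are illustrative, not the actual checker in torch/testing/_internal/common_utils.py:

    import gc
    import torch

    def check_cuda_leak(fn, device: int = 0) -> None:
        # Illustrative only: the real leak check retries a few times and
        # tracks every visible device, not just one.
        torch.cuda.synchronize(device)
        gc.collect()
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)
        free_before, total = torch.cuda.mem_get_info(device)

        fn()  # the test body under measurement

        torch.cuda.synchronize(device)
        gc.collect()
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)

        if alloc_after > alloc_before or free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after} bytes, driver allocated "
                f"{total - free_before} -> {total - free_after} bytes"
            )

A persistent delta of this shape can also come from lazily initialized state (NCCL communicators, library workspaces) rather than a true per-iteration leak, which is one reason the harness retries the single test before declaring it FAILED CONSISTENTLY, as the runs below show.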
2025-12-04T15:04:54.7147861Z ====================== 1 failed, 26 deselected in 39.62s =======================
2025-12-04T15:04:54.7147899Z Got exit code 1
2025-12-04T15:04:54.7147941Z Retrying single test...
2025-12-04T15:04:54.7148128Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-dc522191463327b7.xml
2025-12-04T15:04:54.7148187Z ============================= test session starts ==============================
2025-12-04T15:04:54.7148299Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.7148342Z cachedir: .pytest_cache
2025-12-04T15:04:54.7148501Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.7148549Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.7148590Z configfile: pytest.ini
2025-12-04T15:04:54.7148755Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.7148829Z collecting ... collected 60 items / 26 deselected / 34 selected
2025-12-04T15:04:54.7149049Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7149093Z Running 1 items in this shard
2025-12-04T15:04:54.7149095Z
2025-12-04T15:04:54.7149395Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_none_cuda I1204 14:44:42.449000 420474 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 420543
2025-12-04T15:04:54.7149549Z I1204 14:44:42.450000 420474 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 420544
2025-12-04T15:04:54.7149703Z I1204 14:44:42.451000 420474 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 420545
2025-12-04T15:04:54.7149852Z I1204 14:44:42.451000 420474 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 420546
2025-12-04T15:04:54.7150477Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7150530Z _warn_cpu_init()
2025-12-04T15:04:54.7151093Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7151145Z _warn_cpu_init()
2025-12-04T15:04:54.7151717Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7151759Z _warn_cpu_init()
2025-12-04T15:04:54.7152335Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7152373Z _warn_cpu_init()
2025-12-04T15:04:54.7152668Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T15:04:54.7152711Z return func(*args, **kwargs)
2025-12-04T15:04:54.7152856Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.7153017Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.7153307Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7153464Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.7153746Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7153872Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.7154152Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7154301Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7154577Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7154726Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7155005Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7155164Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.7155443Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7155589Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.7156073Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552.
2025-12-04T15:04:54.7156189Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7156396Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7156748Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7156864Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7157074Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7157238Z [rank1]:E1204 14:45:20.028000 420544 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T15:04:54.7157280Z dist init r=1, world=4
2025-12-04T15:04:54.7157419Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.7157579Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.7157866Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7158019Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.7158303Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7158429Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.7158704Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7158852Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7159130Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7159287Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7159584Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7159720Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.7160007Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7160153Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.7160684Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336.
2025-12-04T15:04:54.7160802Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7160996Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7161349Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7161462Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7161676Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7161841Z [rank2]:E1204 14:45:20.038000 420545 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T15:04:54.7161881Z dist init r=2, world=4
2025-12-04T15:04:54.7162019Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.7162181Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.7162467Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7162621Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.7162907Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7163029Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.7163305Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7163451Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7163754Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7163899Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7164174Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7164324Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.7164601Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7164750Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.7165231Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432.
2025-12-04T15:04:54.7165347Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7165540Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7165893Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7166008Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7166216Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7166380Z [rank0]:E1204 14:45:20.066000 420543 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T15:04:54.7166418Z dist init r=0, world=4
2025-12-04T15:04:54.7166557Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.7166718Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.7167010Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7167162Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.7167450Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7167573Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.7167864Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7168023Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7168296Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7168442Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.7168725Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7168865Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.7169152Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7169300Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.7169770Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688.
2025-12-04T15:04:54.7169883Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7170080Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7170485Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7170599Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.7170810Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7170975Z [rank3]:E1204 14:45:20.101000 420546 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T15:04:54.7171015Z dist init r=3, world=4
2025-12-04T15:04:54.7171353Z [rank0]:[W1204 14:45:20.804871119 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T15:04:54.7171396Z FAILED [39.5630s] [100%]
2025-12-04T15:04:54.7171398Z
2025-12-04T15:04:54.7171454Z =================================== FAILURES ===================================
2025-12-04T15:04:54.7171554Z _____ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda _____
2025-12-04T15:04:54.7171601Z Traceback (most recent call last):
2025-12-04T15:04:54.7171764Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.7171831Z self._join_processes(fn)
2025-12-04T15:04:54.7172003Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.7172075Z self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.7172253Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.7172299Z raise RuntimeError(error)
2025-12-04T15:04:54.7172381Z RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T15:04:54.7172427Z Traceback (most recent call last):
2025-12-04T15:04:54.7172591Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7172650Z getattr(self, test_name)()
2025-12-04T15:04:54.7172810Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7172847Z fn()
2025-12-04T15:04:54.7173000Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7173043Z method(*args, **kwargs)
2025-12-04T15:04:54.7173210Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7173253Z method(*args, **kwargs)
2025-12-04T15:04:54.7173404Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7173441Z with policy():
2025-12-04T15:04:54.7173596Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7173638Z raise RuntimeError(msg)
2025-12-04T15:04:54.7173985Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552.
2025-12-04T15:04:54.7173989Z
2025-12-04T15:04:54.7174065Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7174291Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7174294Z
2025-12-04T15:04:54.7174382Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7174384Z
2025-12-04T15:04:54.7174446Z Process 2 exited with error code 10 and exception:
2025-12-04T15:04:54.7174491Z Traceback (most recent call last):
2025-12-04T15:04:54.7174657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7174701Z getattr(self, test_name)()
2025-12-04T15:04:54.7174862Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7174901Z fn()
2025-12-04T15:04:54.7175052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7175094Z method(*args, **kwargs)
2025-12-04T15:04:54.7175243Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7175285Z method(*args, **kwargs)
2025-12-04T15:04:54.7175434Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7175477Z with policy():
2025-12-04T15:04:54.7175628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7175692Z raise RuntimeError(msg)
2025-12-04T15:04:54.7176037Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336.
2025-12-04T15:04:54.7176051Z
2025-12-04T15:04:54.7176128Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7176351Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7176354Z
2025-12-04T15:04:54.7176442Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7176455Z
2025-12-04T15:04:54.7176457Z
2025-12-04T15:04:54.7176533Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T15:04:54.7176623Z Process 1 terminated with exit code 10, terminating remaining processes.
2025-12-04T15:04:54.7176853Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-dc522191463327b7.xml -
2025-12-04T15:04:54.7176928Z =========================== short test summary info ============================
2025-12-04T15:04:54.7177171Z FAILED [39.5630s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T15:04:54.7177218Z Traceback (most recent call last):
2025-12-04T15:04:54.7177382Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7177426Z getattr(self, test_name)()
2025-12-04T15:04:54.7177587Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7177625Z fn()
2025-12-04T15:04:54.7177777Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7177818Z method(*args, **kwargs)
2025-12-04T15:04:54.7177970Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7178010Z method(*args, **kwargs)
2025-12-04T15:04:54.7178160Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7178198Z with policy():
2025-12-04T15:04:54.7178351Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7178393Z raise RuntimeError(msg)
2025-12-04T15:04:54.7178739Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552.
2025-12-04T15:04:54.7178742Z
2025-12-04T15:04:54.7178816Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7179044Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7179046Z
2025-12-04T15:04:54.7179132Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7179136Z
2025-12-04T15:04:54.7179193Z Process 2 exited with error code 10 and exception:
2025-12-04T15:04:54.7179242Z Traceback (most recent call last):
2025-12-04T15:04:54.7179407Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.7179451Z getattr(self, test_name)()
2025-12-04T15:04:54.7179623Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.7179674Z fn()
2025-12-04T15:04:54.7179823Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7179866Z method(*args, **kwargs)
2025-12-04T15:04:54.7180016Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.7180058Z method(*args, **kwargs)
2025-12-04T15:04:54.7180244Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.7180284Z with policy():
2025-12-04T15:04:54.7180450Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.7180496Z raise RuntimeError(msg)
2025-12-04T15:04:54.7180852Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336.
2025-12-04T15:04:54.7180856Z
2025-12-04T15:04:54.7180933Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.7181156Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7181158Z
2025-12-04T15:04:54.7181246Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.7181310Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
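Two of the warnings repeated in both runs above point at process-group hygiene rather than at the leak itself: barrier() has to guess a device unless init_process_group is given a device_id, and skipping destroy_process_group() is what triggers the ProcessGroupNCCL exit warning. A minimal per-rank setup/teardown sketch under assumed single-node settings; the function name main, the MASTER_ADDR/MASTER_PORT values, and the one-GPU-per-rank mapping are illustrative:

    import os
    import torch
    import torch.distributed as dist

    def main(rank: int, world_size: int) -> None:
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        torch.cuda.set_device(rank)  # one GPU per rank (assumed)
        dist.init_process_group(
            "nccl",
            rank=rank,
            world_size=world_size,
            # An explicit device binds collectives to this GPU and
            # silences the barrier() UserWarning seen above.
            device_id=torch.device("cuda", rank),
        )
        try:
            dist.barrier()
            # ... test or training body ...
        finally:
            # Avoids the "destroy_process_group() was not called" warning.
            dist.destroy_process_group()

Neither change addresses the leak itself, but both remove noise that makes runs like the ones above harder to read.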
2025-12-04T15:04:54.7181377Z ====================== 1 failed, 26 deselected in 39.72s =======================
2025-12-04T15:04:54.7181415Z Got exit code 1
2025-12-04T15:04:54.7181594Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_none_cuda
2025-12-04T15:04:54.7181724Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T15:04:54.7181914Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-11addf6822a90204.xml
2025-12-04T15:04:54.7181976Z ============================= test session starts ==============================
2025-12-04T15:04:54.7182089Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.7182133Z cachedir: .pytest_cache
2025-12-04T15:04:54.7182291Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.7182339Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.7182383Z configfile: pytest.ini
2025-12-04T15:04:54.7182549Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.7182623Z collecting ... collected 60 items / 6 deselected / 54 selected
2025-12-04T15:04:54.7182680Z stepcurrent: skipping 6 already run items.
2025-12-04T15:04:54.7182724Z Running 21 items in this shard
2025-12-04T15:04:54.7182726Z
2025-12-04T15:04:54.7183034Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda I1204 14:45:24.630000 420876 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 420945
2025-12-04T15:04:54.7183190Z I1204 14:45:24.630000 420876 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 420946
2025-12-04T15:04:54.7183344Z I1204 14:45:24.631000 420876 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 420947
2025-12-04T15:04:54.7183507Z I1204 14:45:24.631000 420876 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 420948
2025-12-04T15:04:54.7184103Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7184145Z _warn_cpu_init()
2025-12-04T15:04:54.7184454Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1.
2025-12-04T15:04:54.7184495Z _init_core_state(
2025-12-04T15:04:54.7184999Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.7185066Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.7185634Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7185675Z _warn_cpu_init()
2025-12-04T15:04:54.7185971Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1.
2025-12-04T15:04:54.7186011Z _init_core_state(
2025-12-04T15:04:54.7186503Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.7186564Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.7187133Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7187172Z _warn_cpu_init()
2025-12-04T15:04:54.7187465Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1.
2025-12-04T15:04:54.7187503Z _init_core_state(
2025-12-04T15:04:54.7187995Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.7188079Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.7188646Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.7188687Z _warn_cpu_init()
2025-12-04T15:04:54.7189186Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.7189248Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.7189752Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.7189810Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.7190346Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.7190405Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.7190704Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1.
2025-12-04T15:04:54.7190743Z _init_core_state(
2025-12-04T15:04:54.7191230Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.7191290Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.7192567Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.)
2025-12-04T15:04:54.7192712Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7194008Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.7194134Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7195401Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.7195524Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7196775Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T15:04:54.7196898Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7197126Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7197171Z return func(*args, **kwargs) 2025-12-04T15:04:54.7197396Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7197451Z return func(*args, **kwargs) 2025-12-04T15:04:54.7197685Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7197727Z return func(*args, **kwargs) 2025-12-04T15:04:54.7197952Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7197996Z return func(*args, **kwargs) 2025-12-04T15:04:54.7198216Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7198269Z return func(*args, **kwargs) 2025-12-04T15:04:54.7198487Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7198531Z return func(*args, **kwargs) 2025-12-04T15:04:54.7198759Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7198803Z return func(*args, **kwargs) 2025-12-04T15:04:54.7199019Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7199061Z return func(*args, **kwargs) 2025-12-04T15:04:54.7199354Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T15:04:54.7199394Z return func(*args, **kwargs) 2025-12-04T15:04:54.7199540Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7199702Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7199994Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7200148Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7200480Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7200605Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7200885Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7201033Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7201312Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7201464Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7201741Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7201908Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7202186Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7202336Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7202829Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 
2025-12-04T15:04:54.7202952Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7203164Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7203520Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7203639Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7203850Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7204018Z [rank1]:E1204 14:45:33.318000 420946 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7204059Z dist init r=1, world=4 2025-12-04T15:04:54.7204200Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7204359Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7204648Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7204803Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7205093Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7205222Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7205498Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7205646Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7205922Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7206082Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7206369Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7206507Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7206806Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7206953Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7207445Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 2025-12-04T15:04:54.7207561Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7207758Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7208112Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7208229Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7208443Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7208606Z [rank2]:E1204 14:45:33.324000 420947 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7208648Z dist init r=2, world=4 2025-12-04T15:04:54.7208785Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7208946Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7209232Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7209390Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7209778Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7209905Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7210236Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7210386Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.7210695Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7210841Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7211116Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7211267Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7211547Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7211708Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7212187Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T15:04:54.7212304Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7212499Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7212856Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7212969Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7213182Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7213346Z [rank3]:E1204 14:45:33.333000 420948 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7213389Z dist init r=3, world=4 2025-12-04T15:04:54.7213526Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7213690Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7213978Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7214133Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T15:04:54.7214424Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7214548Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7214837Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7214995Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7215273Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7215419Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7215758Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7215897Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7216185Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7216335Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7216810Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T15:04:54.7216927Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7217123Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7217483Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7217599Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7217809Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7217975Z [rank0]:E1204 14:45:33.339000 420945 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7218016Z dist init r=0, world=4 2025-12-04T15:04:54.7218356Z [rank2]:[W1204 14:45:33.023651832 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7218684Z [rank1]:[W1204 14:45:33.034638055 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7219015Z [rank3]:[W1204 14:45:33.112017672 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7219373Z [rank0]:[W1204 14:45:33.119304825 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7219414Z FAILED [23.0381s] [ 4%] 2025-12-04T15:04:54.7219416Z 2025-12-04T15:04:54.7219475Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7219575Z ____ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda _____ 2025-12-04T15:04:54.7219624Z Traceback (most recent call last): 2025-12-04T15:04:54.7219799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7219846Z self._join_processes(fn) 2025-12-04T15:04:54.7220019Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7220076Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7220321Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7220368Z raise RuntimeError(error) 2025-12-04T15:04:54.7220447Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7220495Z Traceback (most recent call last): 2025-12-04T15:04:54.7220655Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7220701Z getattr(self, test_name)() 2025-12-04T15:04:54.7220859Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7220897Z fn() 2025-12-04T15:04:54.7221049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7221094Z method(*args, **kwargs) 2025-12-04T15:04:54.7221244Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7221287Z method(*args, **kwargs) 2025-12-04T15:04:54.7221437Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7221476Z with policy(): 2025-12-04T15:04:54.7221628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7221673Z raise RuntimeError(msg) 2025-12-04T15:04:54.7222023Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T15:04:54.7222029Z 2025-12-04T15:04:54.7222103Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7222333Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7222335Z 2025-12-04T15:04:54.7222423Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7222424Z 2025-12-04T15:04:54.7222426Z 2025-12-04T15:04:54.7222504Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7222592Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7222828Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-11addf6822a90204.xml - 2025-12-04T15:04:54.7222904Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7223176Z FAILED [23.0381s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7223223Z Traceback (most recent call last): 2025-12-04T15:04:54.7223387Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7223429Z getattr(self, test_name)() 2025-12-04T15:04:54.7223591Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7223626Z fn() 2025-12-04T15:04:54.7223793Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7223835Z method(*args, **kwargs) 2025-12-04T15:04:54.7223989Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7224030Z method(*args, **kwargs) 2025-12-04T15:04:54.7224193Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7224231Z with policy(): 2025-12-04T15:04:54.7224384Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7224425Z raise RuntimeError(msg) 2025-12-04T15:04:54.7224780Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 2025-12-04T15:04:54.7224783Z 2025-12-04T15:04:54.7224857Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7225084Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7225087Z 2025-12-04T15:04:54.7225177Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7225241Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
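The FSDP UserWarnings repeated on every rank above all point at the same setup fix, which the warnings themselves spell out: pin each rank to an explicit device before initializing the process group and FSDP, pass an indexed `device_id` instead of the bare string "cuda", and call `destroy_process_group()` on exit. A hedged sketch of that setup (the torchrun-style LOCAL_RANK environment variable and the toy Linear module are assumptions for illustration, not taken from this job):

    import os
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def main():
        rank = int(os.environ["LOCAL_RANK"])  # assumption: launched via torchrun
        torch.cuda.set_device(rank)  # avoids "FSDP will use the current device N"
        # Passing device_id here also silences the barrier() warning seen above.
        dist.init_process_group("nccl", device_id=torch.device("cuda", rank))
        # An indexed device_id (not bare "cuda") lets FSDP move the CPU module
        # itself, avoiding both the CPU-init and device-guessing warnings.
        model = FSDP(torch.nn.Linear(8, 8), device_id=rank)
        out = model(torch.randn(4, 8, device=f"cuda:{rank}"))
        out.sum().backward()
        dist.barrier()
        dist.destroy_process_group()  # avoids the ProcessGroupNCCL leak warning

    if __name__ == "__main__":
        main()

Launched with something like `torchrun --nproc-per-node=4 repro.py` (filename hypothetical), this removes the device-guessing, CPU-init, barrier, and shutdown warnings that each rank prints in the sessions above.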
2025-12-04T15:04:54.7225305Z ======================= 1 failed, 6 deselected in 23.20s ======================= 2025-12-04T15:04:54.7225343Z Got exit code 1 2025-12-04T15:04:54.7225386Z Retrying single test... 2025-12-04T15:04:54.7225575Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2f95f0278319ef02.xml 2025-12-04T15:04:54.7225632Z ============================= test session starts ============================== 2025-12-04T15:04:54.7225746Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7225792Z cachedir: .pytest_cache 2025-12-04T15:04:54.7225951Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7226000Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7226041Z configfile: pytest.ini 2025-12-04T15:04:54.7226204Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7226279Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.7226500Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7226545Z Running 1 items in this shard 2025-12-04T15:04:54.7226547Z 2025-12-04T15:04:54.7226851Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda I1204 14:45:50.194000 422142 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 422211 2025-12-04T15:04:54.7227029Z I1204 14:45:50.194000 422142 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 422212 2025-12-04T15:04:54.7227180Z I1204 14:45:50.195000 422142 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 422213 2025-12-04T15:04:54.7227331Z I1204 14:45:50.196000 422142 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 422214 2025-12-04T15:04:54.7227920Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7227962Z _warn_cpu_init() 2025-12-04T15:04:54.7228272Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7228312Z _init_core_state( 2025-12-04T15:04:54.7228810Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.7228871Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7229442Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7229481Z _warn_cpu_init() 2025-12-04T15:04:54.7229776Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7229812Z _init_core_state( 2025-12-04T15:04:54.7230342Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7230408Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7230978Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7231019Z _warn_cpu_init() 2025-12-04T15:04:54.7231310Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7231378Z _init_core_state( 2025-12-04T15:04:54.7231863Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7231924Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7232508Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7232549Z _warn_cpu_init() 2025-12-04T15:04:54.7233053Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7233112Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7233603Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7233664Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7233957Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7233999Z _init_core_state( 2025-12-04T15:04:54.7234485Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7234545Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7235030Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7235090Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7236362Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T15:04:54.7236514Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7237787Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.7237914Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7239170Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.7239292Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7240687Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T15:04:54.7240812Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7241064Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7241127Z return func(*args, **kwargs) 2025-12-04T15:04:54.7241351Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7241398Z return func(*args, **kwargs) 2025-12-04T15:04:54.7241621Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7241676Z return func(*args, **kwargs) 2025-12-04T15:04:54.7241901Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7241943Z return func(*args, **kwargs) 2025-12-04T15:04:54.7242166Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7242220Z return func(*args, **kwargs) 2025-12-04T15:04:54.7242445Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7242486Z return func(*args, **kwargs) 2025-12-04T15:04:54.7242706Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7242747Z return func(*args, **kwargs) 2025-12-04T15:04:54.7242969Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7243012Z return func(*args, **kwargs) 2025-12-04T15:04:54.7243308Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T15:04:54.7243349Z return func(*args, **kwargs) 2025-12-04T15:04:54.7243496Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7243660Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7243958Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7244116Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7244404Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7244531Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7244811Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7244964Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7247832Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7248005Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7248281Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7248423Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7248715Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7248888Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7249385Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T15:04:54.7249502Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7249702Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7250058Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7250254Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7250466Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7250634Z [rank0]:E1204 14:45:58.716000 422211 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7250674Z dist init r=0, world=4 2025-12-04T15:04:54.7250820Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7250980Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7251277Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7251434Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7251719Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7251848Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7252123Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7252362Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7252639Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7252789Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7253084Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7253223Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7253519Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7253669Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7254147Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 2025-12-04T15:04:54.7254263Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7254461Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7254820Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7254933Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7255149Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7255313Z [rank1]:E1204 14:45:58.731000 422212 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7255356Z dist init r=1, world=4 2025-12-04T15:04:54.7255495Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7255658Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7255944Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7256101Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7256385Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7256531Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7256823Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7256970Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.7257269Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7257415Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7257693Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7257841Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7258122Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7258269Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7258743Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T15:04:54.7258863Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7259057Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7259414Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7259527Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7259740Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7259906Z [rank3]:E1204 14:45:58.738000 422214 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7259947Z dist init r=3, world=4 2025-12-04T15:04:54.7260086Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7260294Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7260585Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7260739Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T15:04:54.7261059Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7261184Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7261462Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7261623Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7261901Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7262052Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7262344Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7262484Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7262764Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7262915Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7263390Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 
2025-12-04T15:04:54.7263503Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7263704Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7264057Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7264174Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7264385Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7264549Z [rank2]:E1204 14:45:58.765000 422213 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7264589Z dist init r=2, world=4 2025-12-04T15:04:54.7264928Z [rank0]:[W1204 14:45:58.442404851 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7265257Z [rank3]:[W1204 14:45:58.474359607 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7265609Z [rank1]:[W1204 14:45:58.490036688 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7265949Z [rank2]:[W1204 14:45:59.543593172 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7265991Z FAILED [22.7412s] [100%] 2025-12-04T15:04:54.7265993Z 2025-12-04T15:04:54.7266051Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7266152Z ____ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda _____ 2025-12-04T15:04:54.7266202Z Traceback (most recent call last): 2025-12-04T15:04:54.7266377Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7266424Z self._join_processes(fn) 2025-12-04T15:04:54.7266598Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7266655Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7266834Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7266878Z raise RuntimeError(error) 2025-12-04T15:04:54.7266957Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7267006Z Traceback (most recent call last): 2025-12-04T15:04:54.7267167Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7267213Z getattr(self, test_name)() 2025-12-04T15:04:54.7267371Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7267411Z fn() 2025-12-04T15:04:54.7267563Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7267607Z method(*args, **kwargs) 2025-12-04T15:04:54.7267757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7267801Z method(*args, **kwargs) 2025-12-04T15:04:54.7267950Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7267990Z with policy(): 2025-12-04T15:04:54.7268141Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7268186Z raise RuntimeError(msg) 2025-12-04T15:04:54.7268537Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T15:04:54.7268540Z 2025-12-04T15:04:54.7268616Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7268851Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7268853Z 2025-12-04T15:04:54.7268942Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7268956Z 2025-12-04T15:04:54.7268958Z 2025-12-04T15:04:54.7269051Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7269140Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7269378Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2f95f0278319ef02.xml - 2025-12-04T15:04:54.7269439Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7269691Z FAILED [22.7412s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7269755Z Traceback (most recent call last): 2025-12-04T15:04:54.7269925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7269972Z getattr(self, test_name)() 2025-12-04T15:04:54.7270135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7270227Z fn() 2025-12-04T15:04:54.7270394Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7270439Z method(*args, **kwargs) 2025-12-04T15:04:54.7270590Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7270635Z method(*args, **kwargs) 2025-12-04T15:04:54.7270784Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7270826Z with policy(): 2025-12-04T15:04:54.7270978Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7271025Z raise RuntimeError(msg) 2025-12-04T15:04:54.7271376Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 2025-12-04T15:04:54.7271378Z 2025-12-04T15:04:54.7271455Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7271684Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7271686Z 2025-12-04T15:04:54.7271778Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7271842Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
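[annotation] The failure above comes from PyTorch's CUDA memory-leak checker, which snapshots per-device caching-allocator usage around each test and fails the test if usage grew (here: 512 -> 117248 bytes on every rank, confirmed against the driver-reported total). A minimal sketch of that bookkeeping, assuming a hypothetical MemLeakCheck context manager; the real logic lives in torch/testing/_internal/common_utils.py and is enabled by the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 variable shown in the repro command:

import torch

class MemLeakCheck:
    """Snapshot per-device caching-allocator usage; compare on exit.

    Illustrative sketch only, not PyTorch's actual CudaMemoryLeakCheck.
    """

    def __enter__(self):
        torch.cuda.synchronize()
        # memory_allocated() reports bytes held by the caching allocator.
        self.before = [
            torch.cuda.memory_allocated(d) for d in range(torch.cuda.device_count())
        ]
        return self

    def __exit__(self, exc_type, exc, tb):
        torch.cuda.synchronize()
        for d, before in enumerate(self.before):
            after = torch.cuda.memory_allocated(d)
            if after > before:
                # The real harness additionally cross-checks the CUDA driver
                # API before declaring a leak, which is why the error above
                # prints both an allocator pair and a driver pair of numbers.
                raise RuntimeError(
                    f"possible leak on device {d}: {before} -> {after} bytes"
                )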
2025-12-04T15:04:54.7271908Z ====================== 1 failed, 26 deselected in 22.90s ======================= 2025-12-04T15:04:54.7271948Z Got exit code 1 2025-12-04T15:04:54.7271991Z Retrying single test... 2025-12-04T15:04:54.7272180Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ffaa507d9ac6884e.xml 2025-12-04T15:04:54.7272241Z ============================= test session starts ============================== 2025-12-04T15:04:54.7272354Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7272397Z cachedir: .pytest_cache 2025-12-04T15:04:54.7272556Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7272606Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7272648Z configfile: pytest.ini 2025-12-04T15:04:54.7272815Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7272913Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.7273161Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7273207Z Running 1 items in this shard 2025-12-04T15:04:54.7273209Z 2025-12-04T15:04:54.7273513Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda I1204 14:46:15.615000 423408 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 423477 2025-12-04T15:04:54.7273684Z I1204 14:46:15.616000 423408 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 423478 2025-12-04T15:04:54.7273838Z I1204 14:46:15.616000 423408 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 423479 2025-12-04T15:04:54.7273993Z I1204 14:46:15.617000 423408 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 423480 2025-12-04T15:04:54.7274582Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7274624Z _warn_cpu_init() 2025-12-04T15:04:54.7274923Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7274964Z _init_core_state( 2025-12-04T15:04:54.7275458Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.7275521Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7276093Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7276131Z _warn_cpu_init() 2025-12-04T15:04:54.7276430Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7276470Z _init_core_state( 2025-12-04T15:04:54.7276960Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7277024Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7277590Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7277657Z _warn_cpu_init() 2025-12-04T15:04:54.7278229Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7278270Z _warn_cpu_init() 2025-12-04T15:04:54.7278565Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7278606Z _init_core_state( 2025-12-04T15:04:54.7279115Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.7279175Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7279468Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7279506Z _init_core_state( 2025-12-04T15:04:54.7279994Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7280055Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7280576Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7280636Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7281120Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7281183Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7281665Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7281725Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7283030Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T15:04:54.7283171Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7284445Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.7284573Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7285832Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.7285955Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7287216Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T15:04:54.7287360Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7287590Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7287633Z return func(*args, **kwargs) 2025-12-04T15:04:54.7287871Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7287915Z return func(*args, **kwargs) 2025-12-04T15:04:54.7288141Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7288193Z return func(*args, **kwargs) 2025-12-04T15:04:54.7288419Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7288461Z return func(*args, **kwargs) 2025-12-04T15:04:54.7288685Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7288727Z return func(*args, **kwargs) 2025-12-04T15:04:54.7288948Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7288990Z return func(*args, **kwargs) 2025-12-04T15:04:54.7289215Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7289256Z return func(*args, **kwargs) 2025-12-04T15:04:54.7289477Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7289518Z return func(*args, **kwargs) 2025-12-04T15:04:54.7289813Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T15:04:54.7289854Z return func(*args, **kwargs) 2025-12-04T15:04:54.7290001Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7290211Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7290507Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7290665Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7290953Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7291083Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7291381Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7291547Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7291823Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7291988Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7292266Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7292405Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7292701Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7292850Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7293334Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 
2025-12-04T15:04:54.7293451Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7293650Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7294007Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7294121Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7294336Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7294500Z [rank2]:E1204 14:46:24.170000 423479 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7294543Z dist init r=2, world=4 2025-12-04T15:04:54.7294682Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7294845Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7295134Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7295293Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7295578Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7295730Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7296007Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7296154Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7296442Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7296591Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7296882Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7297018Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7297298Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7297449Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7297927Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T15:04:54.7298044Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7298239Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7298596Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7298710Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7298925Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7299091Z [rank3]:E1204 14:46:24.179000 423480 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7299130Z dist init r=3, world=4 2025-12-04T15:04:54.7299269Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7299428Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7299719Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7299893Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7300268Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7300393Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7300683Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7300831Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.7301124Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7301272Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7301547Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7301686Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7301962Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7302116Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7302594Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 2025-12-04T15:04:54.7302711Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7302908Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7303263Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7303379Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7303590Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7303760Z [rank0]:E1204 14:46:24.216000 423477 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7303800Z dist init r=0, world=4 2025-12-04T15:04:54.7303941Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7304118Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7304421Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7304575Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T15:04:54.7304878Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7305006Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7305282Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7305452Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7305728Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7305878Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7306153Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7306292Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7306574Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7306721Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7307205Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 
2025-12-04T15:04:54.7307320Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7307518Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7307872Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7307988Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7308201Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7308364Z [rank1]:E1204 14:46:24.232000 423478 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7308428Z dist init r=1, world=4 2025-12-04T15:04:54.7308763Z [rank2]:[W1204 14:46:24.848193762 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7309094Z [rank3]:[W1204 14:46:24.863657963 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7309432Z [rank0]:[W1204 14:46:24.986445746 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7309773Z [rank1]:[W1204 14:46:24.023494242 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7309816Z FAILED [22.9392s] [100%] 2025-12-04T15:04:54.7309821Z 2025-12-04T15:04:54.7309879Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7309982Z ____ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda _____ 2025-12-04T15:04:54.7310029Z Traceback (most recent call last): 2025-12-04T15:04:54.7310248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7310293Z self._join_processes(fn) 2025-12-04T15:04:54.7310469Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7310526Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7310708Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7310753Z raise RuntimeError(error) 2025-12-04T15:04:54.7310837Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7310882Z Traceback (most recent call last): 2025-12-04T15:04:54.7311045Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7311086Z getattr(self, test_name)() 2025-12-04T15:04:54.7311249Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7311284Z fn() 2025-12-04T15:04:54.7311439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7311483Z method(*args, **kwargs) 2025-12-04T15:04:54.7311638Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7311682Z method(*args, **kwargs) 2025-12-04T15:04:54.7311835Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7311872Z with policy(): 2025-12-04T15:04:54.7312027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7312069Z raise RuntimeError(msg) 2025-12-04T15:04:54.7312424Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T15:04:54.7312445Z 2025-12-04T15:04:54.7312535Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7312766Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7312769Z 2025-12-04T15:04:54.7312860Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7312862Z 2025-12-04T15:04:54.7312922Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7312970Z Traceback (most recent call last): 2025-12-04T15:04:54.7313146Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7313193Z getattr(self, test_name)() 2025-12-04T15:04:54.7313352Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7313393Z fn() 2025-12-04T15:04:54.7313544Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7313589Z method(*args, **kwargs) 2025-12-04T15:04:54.7313757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7313800Z method(*args, **kwargs) 2025-12-04T15:04:54.7313950Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7313992Z with policy(): 2025-12-04T15:04:54.7314144Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7314189Z raise RuntimeError(msg) 2025-12-04T15:04:54.7314538Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 
2025-12-04T15:04:54.7314542Z 2025-12-04T15:04:54.7314622Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7314849Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7314855Z 2025-12-04T15:04:54.7314943Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7314945Z 2025-12-04T15:04:54.7315008Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7315056Z Traceback (most recent call last): 2025-12-04T15:04:54.7315222Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7315266Z getattr(self, test_name)() 2025-12-04T15:04:54.7315427Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7315464Z fn() 2025-12-04T15:04:54.7315618Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7315659Z method(*args, **kwargs) 2025-12-04T15:04:54.7315812Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7315852Z method(*args, **kwargs) 2025-12-04T15:04:54.7316005Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7316044Z with policy(): 2025-12-04T15:04:54.7316198Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7316252Z raise RuntimeError(msg) 2025-12-04T15:04:54.7316606Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T15:04:54.7316620Z 2025-12-04T15:04:54.7316694Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7316924Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7316926Z 2025-12-04T15:04:54.7317026Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7317028Z 2025-12-04T15:04:54.7317033Z 2025-12-04T15:04:54.7317109Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7317202Z Process 0 terminated with exit code 10, terminating remaining processes. 
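Note on the "CUDA driver API confirmed a leak" failures above: the mem-leak-check harness snapshots per-device memory before each test and re-checks it afterward, from two views at once (the caching allocator's bookkeeping and the driver-level allocation). A minimal sketch of that before/after comparison, using torch.cuda.memory_allocated and torch.cuda.mem_get_info as stand-ins; this illustrates the mechanism, it is not the harness's exact code:

    import torch

    def gpu_mem_snapshot(device: int) -> tuple[int, int]:
        # Two views of device memory: the caching allocator's bookkeeping,
        # and a driver-level figure derived from free/total (mem_get_info).
        torch.cuda.synchronize(device)
        allocator_bytes = torch.cuda.memory_allocated(device)
        free_bytes, total_bytes = torch.cuda.mem_get_info(device)
        return allocator_bytes, total_bytes - free_bytes

    before = gpu_mem_snapshot(0)
    # ... test body would run here ...
    after = gpu_mem_snapshot(0)
    if after[0] > before[0] or after[1] > before[1]:
        # Same shape as the failure above: 512 -> 117248 (allocator),
        # 2453667840 -> 17637048320 (driver) on device 0.
        raise RuntimeError(f"possible leak on device 0: {before} -> {after}")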
2025-12-04T15:04:54.7317435Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ffaa507d9ac6884e.xml - 2025-12-04T15:04:54.7317511Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7317757Z FAILED [22.9392s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7317806Z Traceback (most recent call last): 2025-12-04T15:04:54.7317970Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7318017Z getattr(self, test_name)() 2025-12-04T15:04:54.7318175Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7318214Z fn() 2025-12-04T15:04:54.7318367Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7318410Z method(*args, **kwargs) 2025-12-04T15:04:54.7318561Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7318606Z method(*args, **kwargs) 2025-12-04T15:04:54.7318757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7318797Z with policy(): 2025-12-04T15:04:54.7318949Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7318993Z raise RuntimeError(msg) 2025-12-04T15:04:54.7319343Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T15:04:54.7319346Z 2025-12-04T15:04:54.7319423Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7319652Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7319654Z 2025-12-04T15:04:54.7319742Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7319744Z 2025-12-04T15:04:54.7319805Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7319852Z Traceback (most recent call last): 2025-12-04T15:04:54.7320017Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7320076Z getattr(self, test_name)() 2025-12-04T15:04:54.7320283Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7320335Z fn() 2025-12-04T15:04:54.7320488Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7320526Z method(*args, **kwargs) 2025-12-04T15:04:54.7320679Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7320718Z method(*args, **kwargs) 2025-12-04T15:04:54.7320870Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7320927Z with policy(): 2025-12-04T15:04:54.7321084Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7321127Z raise RuntimeError(msg) 2025-12-04T15:04:54.7321490Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 
2025-12-04T15:04:54.7321493Z 2025-12-04T15:04:54.7321568Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7321794Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7321797Z 2025-12-04T15:04:54.7321887Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7321889Z 2025-12-04T15:04:54.7321947Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7321995Z Traceback (most recent call last): 2025-12-04T15:04:54.7322157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7322203Z getattr(self, test_name)() 2025-12-04T15:04:54.7322362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7322398Z fn() 2025-12-04T15:04:54.7322548Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7322591Z method(*args, **kwargs) 2025-12-04T15:04:54.7322741Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7322785Z method(*args, **kwargs) 2025-12-04T15:04:54.7322937Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7322976Z with policy(): 2025-12-04T15:04:54.7323128Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7323174Z raise RuntimeError(msg) 2025-12-04T15:04:54.7323522Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T15:04:54.7323525Z 2025-12-04T15:04:54.7323602Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7323829Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7323831Z 2025-12-04T15:04:54.7323918Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7323997Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
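The repeated ProcessGroupNCCL.cpp:1553 warnings above ("destroy_process_group() was not called before program exit") come from worker processes exiting without explicit teardown. A minimal sketch of the recommended shutdown, with a hypothetical run_worker entry point and placeholder rendezvous settings; the gloo backend is used so the sketch also runs on a machine without GPUs:

    import os
    import torch.distributed as dist

    def run_worker(rank: int, world_size: int) -> None:
        # Placeholder rendezvous config for the default env:// init method.
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        dist.init_process_group("gloo", rank=rank, world_size=world_size)
        try:
            dist.barrier()  # stand-in for the test body
        finally:
            # Explicit shutdown avoids the resource-leak warning logged above;
            # see https://pytorch.org/docs/stable/distributed.html#shutdown
            dist.destroy_process_group()

    if __name__ == "__main__":
        run_worker(0, 1)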
2025-12-04T15:04:54.7324073Z ====================== 1 failed, 26 deselected in 23.10s ======================= 2025-12-04T15:04:54.7324111Z Got exit code 1 2025-12-04T15:04:54.7324290Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda 2025-12-04T15:04:54.7324421Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.7324610Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d544cccfd778cd2d.xml 2025-12-04T15:04:54.7324670Z ============================= test session starts ============================== 2025-12-04T15:04:54.7324796Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7324841Z cachedir: .pytest_cache 2025-12-04T15:04:54.7324999Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7325050Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7325091Z configfile: pytest.ini 2025-12-04T15:04:54.7325268Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7325342Z collecting ... collected 60 items / 7 deselected / 53 selected 2025-12-04T15:04:54.7325398Z stepcurrent: skipping 7 already run items. 2025-12-04T15:04:54.7325442Z Running 20 items in this shard 2025-12-04T15:04:54.7325444Z 2025-12-04T15:04:54.7325763Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_shard_grad_op_cuda I1204 14:46:41.029000 424674 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 424743 2025-12-04T15:04:54.7325919Z I1204 14:46:41.030000 424674 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 424744 2025-12-04T15:04:54.7326077Z I1204 14:46:41.030000 424674 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 424745 2025-12-04T15:04:54.7326228Z I1204 14:46:41.031000 424674 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 424746 2025-12-04T15:04:54.7326807Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7326849Z _warn_cpu_init() 2025-12-04T15:04:54.7327150Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7327194Z _init_core_state( 2025-12-04T15:04:54.7327689Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7327755Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7328321Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7328387Z _warn_cpu_init() 2025-12-04T15:04:54.7328690Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7328728Z _init_core_state( 2025-12-04T15:04:54.7329228Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7329290Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7329867Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7329906Z _warn_cpu_init() 2025-12-04T15:04:54.7330248Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7330290Z _init_core_state( 2025-12-04T15:04:54.7330776Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7330840Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7331405Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7331445Z _warn_cpu_init() 2025-12-04T15:04:54.7331931Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7331994Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7332480Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7332538Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7332838Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7332902Z _init_core_state( 2025-12-04T15:04:54.7333388Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7333446Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7333942Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7334008Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7335290Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
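The _init_utils.py:571 warnings above fire because FSDP received the index-less device string "cuda" as device_id and had to fall back to the current device. A minimal sketch of the two remedies the warning text suggests, pinning the current device and passing an explicitly indexed device; the nn.Linear module is a placeholder, and the sketch assumes a default process group is already initialized on a multi-GPU host:

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_for_rank(rank: int) -> FSDP:
        # Remedy 1: make the current device explicit so bare "cuda" resolves
        # predictably on every rank.
        torch.cuda.set_device(rank)
        module = nn.Linear(8, 8)  # placeholder model
        # Remedy 2: pass an indexed device (a plain int like device_id=rank
        # is also accepted) instead of the bare "cuda" string.
        return FSDP(module, device_id=torch.device(f"cuda:{rank}"))

The adjacent fully_sharded_data_parallel.py:479 warning is related but separate: with a world size of 1 there is nothing to shard, so FSDP downgrades SHARD_GRAD_OP to NO_SHARD regardless of the requested strategy.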
2025-12-04T15:04:54.7335423Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7336681Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.7336808Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7338065Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.7338216Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7339490Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T15:04:54.7339615Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7339844Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7339893Z return func(*args, **kwargs) 2025-12-04T15:04:54.7340118Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7340166Z return func(*args, **kwargs) 2025-12-04T15:04:54.7340426Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7340471Z return func(*args, **kwargs) 2025-12-04T15:04:54.7340692Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7340737Z return func(*args, **kwargs) 2025-12-04T15:04:54.7340956Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7341001Z return func(*args, **kwargs) 2025-12-04T15:04:54.7341220Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7341265Z return func(*args, **kwargs) 2025-12-04T15:04:54.7341484Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7341529Z return func(*args, **kwargs) 2025-12-04T15:04:54.7341754Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7341798Z return func(*args, **kwargs) 2025-12-04T15:04:54.7342092Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
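The c10d_logger.py:83 barrier() warning just above offers its own fix: bind the process group to an indexed device at init time. A minimal sketch, assuming an NCCL backend on a multi-GPU host and placeholder rendezvous settings:

    import os
    import torch
    import torch.distributed as dist

    def init_for_rank(rank: int, world_size: int) -> None:
        # Placeholder rendezvous config; real jobs get these from the launcher.
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        # Passing an indexed device_id here mutes the barrier() warning about
        # "using the device under current context".
        dist.init_process_group(
            "nccl",
            rank=rank,
            world_size=world_size,
            device_id=torch.device(f"cuda:{rank}"),
        )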
2025-12-04T15:04:54.7342161Z return func(*args, **kwargs) 2025-12-04T15:04:54.7342312Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7342477Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7342771Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7342942Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7343232Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7343373Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7343653Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7343806Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7344086Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7344240Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7344520Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7344660Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7344938Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7345089Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7345577Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 
2025-12-04T15:04:54.7345696Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7355616Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7356010Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7356134Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7356410Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7356578Z [rank2]:E1204 14:46:49.617000 424745 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7356623Z dist init r=2, world=4 2025-12-04T15:04:54.7356766Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7356948Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7357242Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7357401Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7357702Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7357830Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7358109Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7358261Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7358542Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7358694Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7358975Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7359114Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7359396Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7359548Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7360046Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T15:04:54.7360165Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7360409Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7360796Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7360927Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7361144Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7361308Z [rank3]:E1204 14:46:49.618000 424746 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7361367Z dist init r=3, world=4 2025-12-04T15:04:54.7361507Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7361669Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7361974Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7362128Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7362415Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7362538Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7362818Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7362967Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T15:04:54.7363245Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7363392Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7363669Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7363806Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7364085Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7364232Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7364717Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 2025-12-04T15:04:54.7364849Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7365057Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7365425Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7365537Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7365760Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7365925Z [rank1]:E1204 14:46:49.657000 424744 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7365965Z dist init r=1, world=4 2025-12-04T15:04:54.7366112Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7366270Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7366558Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7366710Z [rank0]:E1204 14:46:49.669000 424743 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7366995Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7367119Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7367397Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7367546Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7367819Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7367966Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7368244Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7368380Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7368657Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7368807Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7369292Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T15:04:54.7369427Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7369623Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7369998Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7370112Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7370367Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7370545Z [rank0]:E1204 14:46:49.669000 424743 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7370585Z dist init r=0, world=4 2025-12-04T15:04:54.7370925Z [rank2]:[W1204 14:46:49.298230006 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7371258Z [rank3]:[W1204 14:46:49.300572560 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7371586Z [rank0]:[W1204 14:46:49.453413938 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7371912Z [rank1]:[W1204 14:46:49.462698576 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7371954Z FAILED [22.8391s] [ 5%] 2025-12-04T15:04:54.7371957Z 2025-12-04T15:04:54.7372019Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7372124Z _ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda _ 2025-12-04T15:04:54.7372172Z Traceback (most recent call last): 2025-12-04T15:04:54.7372338Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7372385Z self._join_processes(fn) 2025-12-04T15:04:54.7372560Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7372616Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7372799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7372844Z raise RuntimeError(error) 2025-12-04T15:04:54.7372927Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7372974Z Traceback (most recent call last): 2025-12-04T15:04:54.7373138Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7373197Z getattr(self, test_name)() 2025-12-04T15:04:54.7373357Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7373408Z fn() 2025-12-04T15:04:54.7373563Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7373606Z method(*args, **kwargs) 2025-12-04T15:04:54.7373757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7373797Z method(*args, **kwargs) 2025-12-04T15:04:54.7373959Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7373998Z with policy(): 2025-12-04T15:04:54.7374152Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7374194Z raise RuntimeError(msg) 2025-12-04T15:04:54.7374571Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 
2025-12-04T15:04:54.7374574Z 2025-12-04T15:04:54.7374651Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7374893Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7374896Z 2025-12-04T15:04:54.7374986Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7374990Z 2025-12-04T15:04:54.7374992Z 2025-12-04T15:04:54.7375072Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7375164Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7375400Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d544cccfd778cd2d.xml - 2025-12-04T15:04:54.7375462Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7375717Z FAILED [22.8391s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7375764Z Traceback (most recent call last): 2025-12-04T15:04:54.7375928Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7375972Z getattr(self, test_name)() 2025-12-04T15:04:54.7376132Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7376169Z fn() 2025-12-04T15:04:54.7376319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7376360Z method(*args, **kwargs) 2025-12-04T15:04:54.7376511Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7376551Z method(*args, **kwargs) 2025-12-04T15:04:54.7376699Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7376738Z with policy(): 2025-12-04T15:04:54.7376888Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7376930Z raise RuntimeError(msg) 2025-12-04T15:04:54.7377290Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T15:04:54.7377315Z 2025-12-04T15:04:54.7377391Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7377633Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7377635Z 2025-12-04T15:04:54.7377722Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7377802Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
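The failure report above prints a shell repro line. The same invocation driven from Python, for scripting reruns from the repo root; the env-var names and test path are taken verbatim from the log, and PYTORCH_PRINT_REPRO_ON_FAILURE=0 is the switch the log itself documents for silencing the repro banner:

    import os
    import subprocess

    env = dict(os.environ)
    env.update({
        "PYTORCH_TEST_WITH_ROCM": "1",
        "PYTORCH_TEST_CUDA_MEM_LEAK_CHECK": "1",
        # Uncomment to suppress the "To execute this test..." banner:
        # "PYTORCH_PRINT_REPRO_ON_FAILURE": "0",
    })
    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_core.py",
            "TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda",
        ],
        env=env,
        check=True,
    )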
2025-12-04T15:04:54.7377865Z ======================= 1 failed, 7 deselected in 23.00s ======================= 2025-12-04T15:04:54.7377904Z Got exit code 1 2025-12-04T15:04:54.7377944Z Retrying single test... 2025-12-04T15:04:54.7378135Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6ee8b83fadee883c.xml 2025-12-04T15:04:54.7378193Z ============================= test session starts ============================== 2025-12-04T15:04:54.7378326Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7378369Z cachedir: .pytest_cache 2025-12-04T15:04:54.7378527Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7378574Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7378615Z configfile: pytest.ini 2025-12-04T15:04:54.7378778Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7378854Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.7379085Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7379133Z Running 1 items in this shard 2025-12-04T15:04:54.7379136Z 2025-12-04T15:04:54.7379451Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_shard_grad_op_cuda I1204 14:47:06.429000 425940 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 426009 2025-12-04T15:04:54.7379607Z I1204 14:47:06.430000 425940 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 426010 2025-12-04T15:04:54.7379760Z I1204 14:47:06.430000 425940 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 426011 2025-12-04T15:04:54.7379912Z I1204 14:47:06.431000 425940 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 426012 2025-12-04T15:04:54.7380525Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7380565Z _warn_cpu_init() 2025-12-04T15:04:54.7380871Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7380910Z _init_core_state( 2025-12-04T15:04:54.7381402Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.7381493Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7382077Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7382115Z _warn_cpu_init() 2025-12-04T15:04:54.7382415Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7382455Z _init_core_state( 2025-12-04T15:04:54.7382952Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7383014Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7383581Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7383622Z _warn_cpu_init() 2025-12-04T15:04:54.7383920Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7383956Z _init_core_state( 2025-12-04T15:04:54.7384442Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7384500Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7385067Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7385105Z _warn_cpu_init() 2025-12-04T15:04:54.7385589Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7385663Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7385973Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7386012Z _init_core_state( 2025-12-04T15:04:54.7386497Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7386568Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7387056Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7387125Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7387611Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7387669Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7388937Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T15:04:54.7389065Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7390365Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.7390505Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7391783Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.7391908Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7393175Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T15:04:54.7393300Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7393534Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7393578Z return func(*args, **kwargs) 2025-12-04T15:04:54.7393806Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7393848Z return func(*args, **kwargs) 2025-12-04T15:04:54.7394072Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7394115Z return func(*args, **kwargs) 2025-12-04T15:04:54.7394338Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7394379Z return func(*args, **kwargs) 2025-12-04T15:04:54.7394603Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7394643Z return func(*args, **kwargs) 2025-12-04T15:04:54.7394867Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7394911Z return func(*args, **kwargs) 2025-12-04T15:04:54.7395141Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7395196Z return func(*args, **kwargs) 2025-12-04T15:04:54.7395415Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7395458Z return func(*args, **kwargs) 2025-12-04T15:04:54.7395747Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T15:04:54.7395805Z return func(*args, **kwargs) 2025-12-04T15:04:54.7395951Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7396116Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7396417Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7396574Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7396860Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7396986Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7397266Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7397416Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7397692Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7397839Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7398116Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7398253Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7398532Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7398681Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7399170Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 
2025-12-04T15:04:54.7399285Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7399506Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7399875Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7399988Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7400274Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7400439Z [rank1]:E1204 14:47:15.051000 426010 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7400479Z dist init r=1, world=4 2025-12-04T15:04:54.7400619Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7400792Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7401081Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7401235Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7401519Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7401644Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7401921Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7402067Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7402344Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7402491Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7402767Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7402904Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7403180Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7403330Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7403812Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 2025-12-04T15:04:54.7403959Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7404155Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7404531Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7404648Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7404860Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7405038Z [rank0]:E1204 14:47:15.054000 426009 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7405080Z dist init r=0, world=4 2025-12-04T15:04:54.7405221Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7405381Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7405672Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7405829Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7406120Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7406247Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7406526Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7406676Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T15:04:54.7406953Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7407105Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7407379Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7407517Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7407797Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7407957Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7408456Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 2025-12-04T15:04:54.7408570Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7408780Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7409146Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7409273Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7409486Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7409648Z [rank2]:E1204 14:47:15.057000 426011 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7409695Z dist init r=2, world=4 2025-12-04T15:04:54.7409832Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7409994Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7410316Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7410472Z [rank3]:E1204 14:47:15.096000 426012 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7410756Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7410884Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7411163Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7411319Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7411599Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7411746Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7412025Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7412161Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7412480Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7412628Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7413127Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 
2025-12-04T15:04:54.7413245Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7413444Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7413825Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7413938Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7414152Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7414315Z [rank3]:E1204 14:47:15.096000 426012 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7414361Z dist init r=3, world=4 2025-12-04T15:04:54.7414698Z [rank1]:[W1204 14:47:15.737743656 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7415029Z [rank2]:[W1204 14:47:15.799434010 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7415359Z [rank0]:[W1204 14:47:15.804429964 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7415688Z [rank3]:[W1204 14:47:15.864223069 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7415733Z FAILED [23.0408s] [100%] 2025-12-04T15:04:54.7415735Z 2025-12-04T15:04:54.7415795Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7415903Z _ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda _ 2025-12-04T15:04:54.7415951Z Traceback (most recent call last): 2025-12-04T15:04:54.7416116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7416164Z self._join_processes(fn) 2025-12-04T15:04:54.7416340Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7416407Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7416599Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7416644Z raise RuntimeError(error) 2025-12-04T15:04:54.7416728Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7416774Z Traceback (most recent call last): 2025-12-04T15:04:54.7416937Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7416981Z getattr(self, test_name)() 2025-12-04T15:04:54.7417156Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7417193Z fn() 2025-12-04T15:04:54.7417347Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7417393Z method(*args, **kwargs) 2025-12-04T15:04:54.7417546Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7417590Z method(*args, **kwargs) 2025-12-04T15:04:54.7417751Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7417793Z with policy(): 2025-12-04T15:04:54.7417945Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7417990Z raise RuntimeError(msg) 2025-12-04T15:04:54.7418354Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T15:04:54.7418357Z 2025-12-04T15:04:54.7418436Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7418678Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7418681Z 2025-12-04T15:04:54.7418772Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7418774Z 2025-12-04T15:04:54.7418834Z Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.7418883Z Traceback (most recent call last): 2025-12-04T15:04:54.7419045Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7419090Z getattr(self, test_name)() 2025-12-04T15:04:54.7419248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7419287Z fn() 2025-12-04T15:04:54.7419437Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7419481Z method(*args, **kwargs) 2025-12-04T15:04:54.7419631Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7419675Z method(*args, **kwargs) 2025-12-04T15:04:54.7419825Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7419864Z with policy(): 2025-12-04T15:04:54.7420019Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7420060Z raise RuntimeError(msg) 2025-12-04T15:04:54.7420479Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 
2025-12-04T15:04:54.7420511Z 2025-12-04T15:04:54.7420588Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7420831Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7420833Z 2025-12-04T15:04:54.7420921Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7420923Z 2025-12-04T15:04:54.7420985Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7421046Z Traceback (most recent call last): 2025-12-04T15:04:54.7421211Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7421254Z getattr(self, test_name)() 2025-12-04T15:04:54.7421415Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7421451Z fn() 2025-12-04T15:04:54.7421619Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7421660Z method(*args, **kwargs) 2025-12-04T15:04:54.7421811Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7421851Z method(*args, **kwargs) 2025-12-04T15:04:54.7422004Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7422043Z with policy(): 2025-12-04T15:04:54.7422195Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7422237Z raise RuntimeError(msg) 2025-12-04T15:04:54.7422599Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T15:04:54.7422603Z 2025-12-04T15:04:54.7422679Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7422919Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7422921Z 2025-12-04T15:04:54.7423009Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7423011Z 2025-12-04T15:04:54.7423013Z 2025-12-04T15:04:54.7423091Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7423183Z Process 0 terminated with exit code 10, terminating remaining processes. 
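The ProcessGroupNCCL warnings above flag that destroy_process_group() was never called before the worker processes exited, and the earlier barrier() warning suggests passing `device_id` to `init_process_group`. A hedged sketch of both fixes together, with placeholder rank/world_size wiring (the test harness does its own rendezvous):

import torch
import torch.distributed as dist

def run_worker(rank: int, world_size: int) -> None:
    # Placeholder rendezvous; MASTER_ADDR/MASTER_PORT must be set in the env.
    dist.init_process_group(
        "nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),  # silences the barrier() warning
    )
    try:
        ...  # test or training body
    finally:
        dist.destroy_process_group()  # the teardown the NCCL warning asks for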
2025-12-04T15:04:54.7423417Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6ee8b83fadee883c.xml - 2025-12-04T15:04:54.7423483Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7423741Z FAILED [23.0408s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7423790Z Traceback (most recent call last): 2025-12-04T15:04:54.7423957Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7424000Z getattr(self, test_name)() 2025-12-04T15:04:54.7424160Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7424210Z fn() 2025-12-04T15:04:54.7424375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7424416Z method(*args, **kwargs) 2025-12-04T15:04:54.7424568Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7424609Z method(*args, **kwargs) 2025-12-04T15:04:54.7424759Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7424796Z with policy(): 2025-12-04T15:04:54.7424959Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7425001Z raise RuntimeError(msg) 2025-12-04T15:04:54.7425363Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T15:04:54.7425367Z 2025-12-04T15:04:54.7425451Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7425690Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7425692Z 2025-12-04T15:04:54.7425777Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7425779Z 2025-12-04T15:04:54.7425838Z Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.7425884Z Traceback (most recent call last): 2025-12-04T15:04:54.7426047Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7426090Z getattr(self, test_name)() 2025-12-04T15:04:54.7426251Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7426288Z fn() 2025-12-04T15:04:54.7426437Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7426480Z method(*args, **kwargs) 2025-12-04T15:04:54.7426628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7426671Z method(*args, **kwargs) 2025-12-04T15:04:54.7426819Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7426858Z with policy(): 2025-12-04T15:04:54.7427008Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7427056Z raise RuntimeError(msg) 2025-12-04T15:04:54.7427414Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 
2025-12-04T15:04:54.7427416Z 2025-12-04T15:04:54.7427492Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7427728Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7427730Z 2025-12-04T15:04:54.7427819Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7427821Z 2025-12-04T15:04:54.7427879Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7427948Z Traceback (most recent call last): 2025-12-04T15:04:54.7428111Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7428166Z getattr(self, test_name)() 2025-12-04T15:04:54.7428324Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7428363Z fn() 2025-12-04T15:04:54.7428512Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7428556Z method(*args, **kwargs) 2025-12-04T15:04:54.7428704Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7428758Z method(*args, **kwargs) 2025-12-04T15:04:54.7428910Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7428949Z with policy(): 2025-12-04T15:04:54.7429103Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7429145Z raise RuntimeError(msg) 2025-12-04T15:04:54.7429515Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T15:04:54.7429517Z 2025-12-04T15:04:54.7429591Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7429838Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7429840Z 2025-12-04T15:04:54.7429926Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7429995Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.7430062Z ====================== 1 failed, 26 deselected in 23.18s ======================= 2025-12-04T15:04:54.7430104Z Got exit code 1 2025-12-04T15:04:54.7430146Z Retrying single test... 
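The retried session below re-prints the FSDP `device_id` UserWarnings: each rank passes a bare "cuda" device, so FSDP falls back to whatever the current device happens to be. What the warning recommends is pinning the device explicitly, sketched here with placeholder module/rank arguments:

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_with_fsdp(module: torch.nn.Module, rank: int) -> FSDP:
    # Make the current device explicit, as the warning suggests...
    torch.cuda.set_device(rank)
    # ...and pass an indexed device instead of the bare "cuda" string.
    return FSDP(module, device_id=torch.device("cuda", rank))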
2025-12-04T15:04:54.7430400Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-fe56a79a4ba92f05.xml 2025-12-04T15:04:54.7430460Z ============================= test session starts ============================== 2025-12-04T15:04:54.7430577Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7430619Z cachedir: .pytest_cache 2025-12-04T15:04:54.7430780Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7430827Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7430872Z configfile: pytest.ini 2025-12-04T15:04:54.7431036Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7431115Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.7431348Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7431394Z Running 1 items in this shard 2025-12-04T15:04:54.7431396Z 2025-12-04T15:04:54.7431715Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_shard_grad_op_cuda I1204 14:47:32.106000 427206 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 427275 2025-12-04T15:04:54.7431869Z I1204 14:47:32.107000 427206 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 427276 2025-12-04T15:04:54.7432039Z I1204 14:47:32.108000 427206 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 427277 2025-12-04T15:04:54.7432207Z I1204 14:47:32.108000 427206 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 427278 2025-12-04T15:04:54.7432797Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7432836Z _warn_cpu_init() 2025-12-04T15:04:54.7433139Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7433181Z _init_core_state( 2025-12-04T15:04:54.7433683Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.7433750Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7434314Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7434355Z _warn_cpu_init() 2025-12-04T15:04:54.7434654Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7434692Z _init_core_state( 2025-12-04T15:04:54.7435179Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7435238Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7435803Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7435843Z _warn_cpu_init() 2025-12-04T15:04:54.7436143Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7436181Z _init_core_state( 2025-12-04T15:04:54.7436667Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7436753Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7437329Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7437370Z _warn_cpu_init() 2025-12-04T15:04:54.7437864Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7437928Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7438223Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7438263Z _init_core_state( 2025-12-04T15:04:54.7438747Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7438804Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7439293Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7439351Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7439841Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7439899Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7441202Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T15:04:54.7441357Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7442644Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.7442768Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7444017Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.7444139Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7445393Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T15:04:54.7445514Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.7445743Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7445787Z return func(*args, **kwargs) 2025-12-04T15:04:54.7446025Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7446080Z return func(*args, **kwargs) 2025-12-04T15:04:54.7446301Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7446344Z return func(*args, **kwargs) 2025-12-04T15:04:54.7446561Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7446604Z return func(*args, **kwargs) 2025-12-04T15:04:54.7446833Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7446879Z return func(*args, **kwargs) 2025-12-04T15:04:54.7447099Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7447154Z return func(*args, **kwargs) 2025-12-04T15:04:54.7447372Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7447415Z return func(*args, **kwargs) 2025-12-04T15:04:54.7447632Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7447678Z return func(*args, **kwargs) 2025-12-04T15:04:54.7447968Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
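Both the FSDP `device_id` warnings and the `barrier()` warning above trace back to the same setup gap: no explicit device is pinned per rank, so a bare `cuda` resolves without an index. A minimal per-rank setup sketch following the remedies the warnings themselves name (`torch.cuda.set_device()`, an explicit `device_id`); the `rank`, `world_size`, and `model` names are illustrative, env-based rendezvous (MASTER_ADDR/MASTER_PORT) is assumed, and the `device_id` argument to `init_process_group` assumes a PyTorch version that accepts it:

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def setup_fsdp(rank: int, world_size: int, model: torch.nn.Module) -> FSDP:
        # Pin this process to its GPU first, so bare "cuda" resolves to an
        # explicit index in every later call.
        torch.cuda.set_device(rank)
        dist.init_process_group(
            "nccl",
            rank=rank,
            world_size=world_size,
            device_id=torch.device(f"cuda:{rank}"),  # addresses the barrier() warning
        )
        # Pass the explicit index instead of the bare "cuda" device string.
        return FSDP(model, device_id=rank)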
2025-12-04T15:04:54.7448012Z return func(*args, **kwargs) 2025-12-04T15:04:54.7448157Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7448321Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7448610Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7448765Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7449052Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7449179Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7449462Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7449611Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7449890Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7450037Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7450400Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7450540Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7450817Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7450981Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7451467Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 
2025-12-04T15:04:54.7451602Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7451800Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7452171Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7452288Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7452501Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7452669Z [rank1]:E1204 14:47:40.743000 427276 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7452711Z dist init r=1, world=4 2025-12-04T15:04:54.7452856Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7453016Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7453306Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7453461Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7453751Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7453877Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7454160Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7454310Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7454600Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7454764Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7455038Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7455178Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7455466Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7455619Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7456118Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 2025-12-04T15:04:54.7456233Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7456432Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7456797Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7456914Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7457123Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7457288Z [rank0]:E1204 14:47:40.744000 427275 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7457330Z dist init r=0, world=4 2025-12-04T15:04:54.7457468Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7457629Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7457916Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7458071Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7458353Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7458479Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7458756Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7458927Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T15:04:54.7459200Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7459347Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7459641Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7459778Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7460070Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7460256Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7460740Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 2025-12-04T15:04:54.7460854Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7461055Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7461423Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7461536Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7461747Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7461910Z [rank2]:E1204 14:47:40.747000 427277 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7461955Z dist init r=2, world=4 2025-12-04T15:04:54.7462092Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7462254Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7462540Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7462696Z [rank3]:E1204 14:47:40.795000 427278 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7462980Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7463132Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7463410Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7463558Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7463849Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7463996Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7464273Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7464422Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7464701Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7464851Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7465333Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 
2025-12-04T15:04:54.7465451Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7465647Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7466016Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7466129Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7466342Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7466510Z [rank3]:E1204 14:47:40.795000 427278 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7466549Z dist init r=3, world=4 2025-12-04T15:04:54.7466887Z [rank1]:[W1204 14:47:40.429871642 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7467217Z [rank0]:[W1204 14:47:40.444374959 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7467559Z [rank2]:[W1204 14:47:40.460335960 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7467895Z [rank3]:[W1204 14:47:41.587806119 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7467940Z FAILED [23.0386s] [100%] 2025-12-04T15:04:54.7467943Z 2025-12-04T15:04:54.7468002Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7468120Z _ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda _ 2025-12-04T15:04:54.7468172Z Traceback (most recent call last): 2025-12-04T15:04:54.7468338Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7468386Z self._join_processes(fn) 2025-12-04T15:04:54.7468568Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7468625Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7468804Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7468851Z raise RuntimeError(error) 2025-12-04T15:04:54.7468931Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7468978Z Traceback (most recent call last): 2025-12-04T15:04:54.7469140Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7469185Z getattr(self, test_name)() 2025-12-04T15:04:54.7469344Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7469382Z fn() 2025-12-04T15:04:54.7469533Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7469577Z method(*args, **kwargs) 2025-12-04T15:04:54.7469727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7469770Z method(*args, **kwargs) 2025-12-04T15:04:54.7469919Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7469959Z with policy(): 2025-12-04T15:04:54.7470110Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7470154Z raise RuntimeError(msg) 2025-12-04T15:04:54.7470573Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T15:04:54.7470576Z 2025-12-04T15:04:54.7470654Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7470893Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7470899Z 2025-12-04T15:04:54.7470986Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7470989Z 2025-12-04T15:04:54.7471050Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7471093Z Traceback (most recent call last): 2025-12-04T15:04:54.7471271Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7471327Z getattr(self, test_name)() 2025-12-04T15:04:54.7471486Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7471521Z fn() 2025-12-04T15:04:54.7471673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7471715Z method(*args, **kwargs) 2025-12-04T15:04:54.7471866Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7471907Z method(*args, **kwargs) 2025-12-04T15:04:54.7472075Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7472114Z with policy(): 2025-12-04T15:04:54.7472267Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7472309Z raise RuntimeError(msg) 2025-12-04T15:04:54.7472683Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 2025-12-04T15:04:54.7472686Z 2025-12-04T15:04:54.7472759Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7473000Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7473002Z 2025-12-04T15:04:54.7473091Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7473094Z 2025-12-04T15:04:54.7473095Z 2025-12-04T15:04:54.7473172Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7473261Z Process 0 terminated with exit code 10, terminating remaining processes. 
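Every rank also prints the `ProcessGroupNCCL` warning that `destroy_process_group()` was never called before exit. A minimal teardown sketch along the lines the warning (and its linked shutdown docs) recommend; the `try/finally` placement is illustrative:

    import torch.distributed as dist

    try:
        ...  # per-rank test or training body
    finally:
        # Tear the process group down explicitly before the process exits,
        # as the ProcessGroupNCCL warning advises, to avoid leaking
        # communicator resources.
        if dist.is_initialized():
            dist.destroy_process_group()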
2025-12-04T15:04:54.7473496Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-fe56a79a4ba92f05.xml - 2025-12-04T15:04:54.7473560Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7473813Z FAILED [23.0386s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7473862Z Traceback (most recent call last): 2025-12-04T15:04:54.7474023Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7474067Z getattr(self, test_name)() 2025-12-04T15:04:54.7474225Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7474264Z fn() 2025-12-04T15:04:54.7474413Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7474455Z method(*args, **kwargs) 2025-12-04T15:04:54.7474605Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7474649Z method(*args, **kwargs) 2025-12-04T15:04:54.7474796Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7474835Z with policy(): 2025-12-04T15:04:54.7474986Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7475041Z raise RuntimeError(msg) 2025-12-04T15:04:54.7475418Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T15:04:54.7475423Z 2025-12-04T15:04:54.7475494Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7475732Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7475734Z 2025-12-04T15:04:54.7475830Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7475832Z 2025-12-04T15:04:54.7475893Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7475938Z Traceback (most recent call last): 2025-12-04T15:04:54.7476102Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7476145Z getattr(self, test_name)() 2025-12-04T15:04:54.7476314Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7476348Z fn() 2025-12-04T15:04:54.7476500Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7476540Z method(*args, **kwargs) 2025-12-04T15:04:54.7476689Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7476733Z method(*args, **kwargs) 2025-12-04T15:04:54.7476884Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7476922Z with policy(): 2025-12-04T15:04:54.7477074Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7477116Z raise RuntimeError(msg) 2025-12-04T15:04:54.7477477Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 2025-12-04T15:04:54.7477479Z 2025-12-04T15:04:54.7477556Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7477794Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7477796Z 2025-12-04T15:04:54.7477885Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7477950Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
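The failure itself comes from PyTorch's CUDA memory-leak checker, which snapshots per-device counters around the test body and raises when they do not return to baseline (here the caching allocator goes from 512 to 117248 bytes). A rough sketch of that before/after comparison, not the actual `common_utils` harness; the driver-side check via `torch.cuda.mem_get_info` only approximates the driver counters quoted in the error:

    import torch

    def check_for_leak(device: int, test_fn) -> None:
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)
        free_before, _ = torch.cuda.mem_get_info(device)

        test_fn()

        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before or free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: allocator "
                f"{alloc_before} -> {alloc_after} bytes"
            )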
2025-12-04T15:04:54.7478018Z ====================== 1 failed, 26 deselected in 23.20s ======================= 2025-12-04T15:04:54.7478056Z Got exit code 1 2025-12-04T15:04:54.7478248Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.7478376Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.7478566Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-01de3958bcb91344.xml 2025-12-04T15:04:54.7478625Z ============================= test session starts ============================== 2025-12-04T15:04:54.7478743Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7478800Z cachedir: .pytest_cache 2025-12-04T15:04:54.7478960Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7479019Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7479064Z configfile: pytest.ini 2025-12-04T15:04:54.7479227Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7479303Z collecting ... collected 60 items / 8 deselected / 52 selected 2025-12-04T15:04:54.7479355Z stepcurrent: skipping 8 already run items. 2025-12-04T15:04:54.7479403Z Running 19 items in this shard 2025-12-04T15:04:54.7479405Z 2025-12-04T15:04:54.7479752Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda I1204 14:47:57.568000 428472 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 428541 2025-12-04T15:04:54.7479910Z I1204 14:47:57.569000 428472 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 428542 2025-12-04T15:04:54.7480075Z I1204 14:47:57.570000 428472 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 428543 2025-12-04T15:04:54.7480268Z I1204 14:47:57.570000 428472 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 428544 2025-12-04T15:04:54.7480846Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7480883Z _warn_cpu_init() 2025-12-04T15:04:54.7481379Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.7481443Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7482016Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7482059Z _warn_cpu_init() 2025-12-04T15:04:54.7482545Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7482608Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7483173Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7483233Z _warn_cpu_init() 2025-12-04T15:04:54.7483736Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7483795Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7484381Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7484421Z _warn_cpu_init() 2025-12-04T15:04:54.7484728Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7484814Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T15:04:54.7485303Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.7485365Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7485650Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7485737Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T15:04:54.7486020Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7486102Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T15:04:54.7486596Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7486657Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7487143Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7487199Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7487490Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7487568Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.7488070Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7488140Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7488428Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7488517Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.7488806Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7488884Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.7489178Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7489263Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T15:04:54.7489547Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7489623Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.7489912Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7489959Z return func(*args, **kwargs) 2025-12-04T15:04:54.7490219Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7490265Z return func(*args, **kwargs) 2025-12-04T15:04:54.7490485Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7490530Z return func(*args, **kwargs) 2025-12-04T15:04:54.7490750Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7490794Z return func(*args, **kwargs) 2025-12-04T15:04:54.7491016Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7491059Z return func(*args, **kwargs) 2025-12-04T15:04:54.7491282Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7491323Z return func(*args, **kwargs) 2025-12-04T15:04:54.7491543Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7491584Z return func(*args, **kwargs) 2025-12-04T15:04:54.7491803Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7491844Z return func(*args, **kwargs) 2025-12-04T15:04:54.7492088Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
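The FutureWarnings above come from wrapping with the deprecated `NO_SHARD` sharding strategy; since `NO_SHARD` keeps full parameters on every rank, the deprecation message points at plain `DistributedDataParallel` as the replacement. A minimal sketch of that swap, assuming the per-rank device pinning shown earlier; `model` and `rank` are illustrative:

    import torch
    from torch.nn.parallel import DistributedDataParallel as DDP

    # NO_SHARD replicates full parameters on each rank, which is DDP's
    # native behavior, so the suggested replacement is a direct wrap:
    ddp_model = DDP(model.to(f"cuda:{rank}"), device_ids=[rank])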
2025-12-04T15:04:54.7492145Z return func(*args, **kwargs) 2025-12-04T15:04:54.7492293Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7492455Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7492761Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7492918Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7493207Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7493352Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7493634Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7493785Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7494064Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7494215Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7494493Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7494632Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7494911Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7495062Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7495581Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
2025-12-04T15:04:54.7495698Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7495898Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7496292Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7496432Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7496644Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7496807Z [rank2]:E1204 14:48:30.331000 428543 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7496850Z dist init r=2, world=4 2025-12-04T15:04:54.7496987Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7497158Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7497444Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7497610Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7497894Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7498018Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7498294Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7498446Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7498725Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7498871Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7499148Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7499284Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T15:04:54.7499564Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7499714Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7500289Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T15:04:54.7500410Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7500605Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7501026Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7501140Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7501352Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7501538Z [rank0]:E1204 14:48:30.336000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7501581Z dist init r=0, world=4 2025-12-04T15:04:54.7501718Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7501897Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7502182Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7502335Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7502620Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7502744Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7503027Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7503176Z [rank1]:E1204 14:48:30.385000 428542 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7503451Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7503598Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7503874Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7504016Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7504291Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7504441Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7504945Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T15:04:54.7505084Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7505280Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7505670Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7505797Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7506005Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7506187Z [rank1]:E1204 14:48:30.385000 428542 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7506225Z dist init r=1, world=4 2025-12-04T15:04:54.7506366Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7506524Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7506810Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T15:04:54.7506963Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7507250Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7507372Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7507649Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7507799Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7508074Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7508223Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7508498Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7508636Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7508910Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7509059Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7509586Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
2025-12-04T15:04:54.7509700Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7509912Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7510349Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7510466Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7510689Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7510854Z [rank3]:E1204 14:48:30.387000 428544 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7510896Z dist init r=3, world=4 2025-12-04T15:04:54.7511234Z [rank0]:[W1204 14:48:30.053276160 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7511566Z [rank2]:[W1204 14:48:30.058905096 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7511893Z [rank1]:[W1204 14:48:30.151036344 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7512222Z [rank3]:[W1204 14:48:30.188606715 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7512263Z FAILED [47.1703s] [ 5%] 2025-12-04T15:04:54.7512265Z 2025-12-04T15:04:54.7512324Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7512455Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda _ 2025-12-04T15:04:54.7512506Z Traceback (most recent call last): 2025-12-04T15:04:54.7512670Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7512720Z self._join_processes(fn) 2025-12-04T15:04:54.7512899Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7512952Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7513133Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7513179Z raise RuntimeError(error) 2025-12-04T15:04:54.7513261Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7513323Z Traceback (most recent call last): 2025-12-04T15:04:54.7513499Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7513543Z getattr(self, test_name)() 2025-12-04T15:04:54.7513702Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7513738Z fn() 2025-12-04T15:04:54.7513892Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7513934Z method(*args, **kwargs) 2025-12-04T15:04:54.7514100Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7514142Z method(*args, **kwargs) 2025-12-04T15:04:54.7514294Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7514335Z with policy(): 2025-12-04T15:04:54.7514489Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7514543Z raise RuntimeError(msg) 2025-12-04T15:04:54.7514933Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
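For context on the RuntimeError above: PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 makes the harness snapshot both the caching-allocator counter and the driver-level allocation for each device before the test and compare them afterwards, which is where the "was 512 and is now 166400" / "was 2300575744 and is now 17469276160" pairs come from. Below is a minimal sketch of that comparison using public torch.cuda counters; it is an illustrative approximation, not the actual leak-check code in common_utils.py, and it assumes one visible CUDA/ROCm device.

    import torch

    def snapshot(device: int):
        # Bytes currently held by PyTorch's caching allocator.
        allocator = torch.cuda.memory_allocated(device)
        # Driver-level view: total minus free approximates how much
        # memory has been carved out of the device overall.
        free, total = torch.cuda.mem_get_info(device)
        return allocator, total - free

    before = snapshot(0)
    # ... test body would run here ...
    torch.cuda.empty_cache()  # drop cached blocks before re-measuring
    after = snapshot(0)
    if after[0] > before[0]:
        raise RuntimeError(
            f"possible leak: allocator {before[0]} -> {after[0]} bytes"
        )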
2025-12-04T15:04:54.7514936Z 2025-12-04T15:04:54.7515010Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7515276Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7515279Z 2025-12-04T15:04:54.7515366Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7515371Z 2025-12-04T15:04:54.7515373Z 2025-12-04T15:04:54.7515450Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7515539Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7515770Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-01de3958bcb91344.xml - 2025-12-04T15:04:54.7515834Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7516110Z FAILED [47.1703s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7516162Z Traceback (most recent call last): 2025-12-04T15:04:54.7516326Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7516371Z getattr(self, test_name)() 2025-12-04T15:04:54.7516529Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7516567Z fn() 2025-12-04T15:04:54.7516715Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7516758Z method(*args, **kwargs) 2025-12-04T15:04:54.7516906Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7516950Z method(*args, **kwargs) 2025-12-04T15:04:54.7517099Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7517139Z with policy(): 2025-12-04T15:04:54.7517301Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7517356Z raise RuntimeError(msg) 2025-12-04T15:04:54.7517737Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T15:04:54.7517741Z 2025-12-04T15:04:54.7517816Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7518089Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7518091Z 2025-12-04T15:04:54.7518179Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7518246Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
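The ProcessGroupNCCL warnings earlier in this run fire because each worker process exits without tearing down the default process group. A minimal sketch of the teardown the warning asks for, using a single-process gloo group for illustration (the test itself uses NCCL with one process per GPU; the MASTER_ADDR/MASTER_PORT values below are placeholder rendezvous settings):

    import os
    import torch.distributed as dist

    # Placeholder rendezvous settings for a single-process group.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")

    dist.init_process_group("gloo", rank=0, world_size=1)
    try:
        dist.barrier()  # stand-in for the actual test body
    finally:
        # Calling this before process exit is what silences the
        # "destroy_process_group() was not called" warning.
        dist.destroy_process_group()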
2025-12-04T15:04:54.7518310Z ======================= 1 failed, 8 deselected in 47.33s ======================= 2025-12-04T15:04:54.7518350Z Got exit code 1 2025-12-04T15:04:54.7518401Z Retrying single test... 2025-12-04T15:04:54.7518592Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2d64d19acd11a4dd.xml 2025-12-04T15:04:54.7518650Z ============================= test session starts ============================== 2025-12-04T15:04:54.7518764Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7518805Z cachedir: .pytest_cache 2025-12-04T15:04:54.7518966Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7519011Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7519055Z configfile: pytest.ini 2025-12-04T15:04:54.7519219Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7519296Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.7519550Z stepcurrent: skipping 8 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7519600Z Running 1 items in this shard 2025-12-04T15:04:54.7519602Z 2025-12-04T15:04:54.7519935Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda I1204 14:48:47.181000 429882 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 429951 2025-12-04T15:04:54.7520091Z I1204 14:48:47.181000 429882 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 429952 2025-12-04T15:04:54.7520283Z I1204 14:48:47.182000 429882 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 429953 2025-12-04T15:04:54.7520435Z I1204 14:48:47.182000 429882 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 429954 2025-12-04T15:04:54.7521013Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7521052Z _warn_cpu_init() 2025-12-04T15:04:54.7521544Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.7521636Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7522226Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7522267Z _warn_cpu_init() 2025-12-04T15:04:54.7522770Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7522836Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7523399Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7523441Z _warn_cpu_init() 2025-12-04T15:04:54.7523930Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7523989Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7524561Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7524599Z _warn_cpu_init() 2025-12-04T15:04:54.7524889Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7524974Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T15:04:54.7525261Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T15:04:54.7525344Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T15:04:54.7525831Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7525912Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7526395Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7526455Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7526753Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7526836Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T15:04:54.7527136Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7527216Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.7527502Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7527577Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.7528067Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7528126Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7528414Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7528487Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.7528779Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7528825Z return func(*args, **kwargs) 2025-12-04T15:04:54.7529112Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7529196Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T15:04:54.7529685Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7529743Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7530027Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7530115Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.7530389Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7530436Z return func(*args, **kwargs) 2025-12-04T15:04:54.7530658Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7530703Z return func(*args, **kwargs) 2025-12-04T15:04:54.7530942Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7530985Z return func(*args, **kwargs) 2025-12-04T15:04:54.7531206Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7531249Z return func(*args, **kwargs) 2025-12-04T15:04:54.7531485Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7531526Z return func(*args, **kwargs) 2025-12-04T15:04:54.7531746Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7531787Z return func(*args, **kwargs) 2025-12-04T15:04:54.7532008Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7532049Z return func(*args, **kwargs) 2025-12-04T15:04:54.7532268Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T15:04:54.7532312Z return func(*args, **kwargs) 2025-12-04T15:04:54.7532460Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7532622Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7532913Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7533067Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7533353Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7533483Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7533761Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7533912Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7534187Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7534335Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7534644Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7534785Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7535065Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7535222Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7535742Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
2025-12-04T15:04:54.7535859Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7536058Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7536449Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7536565Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7536779Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7536943Z [rank3]:E1204 14:49:19.837000 429954 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7536985Z dist init r=3, world=4 2025-12-04T15:04:54.7537123Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7537285Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7537569Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7537724Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7538008Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7538134Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7538412Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7538562Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7538867Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7539014Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7539291Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7539441Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T15:04:54.7539723Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7539873Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7540432Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T15:04:54.7540550Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7540749Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7541149Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7541262Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7541475Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7541637Z [rank2]:E1204 14:49:19.842000 429953 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7541678Z dist init r=2, world=4 2025-12-04T15:04:54.7541815Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7541975Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7542262Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7542414Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7542698Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7542822Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7543112Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7543277Z [rank1]:E1204 14:49:19.883000 429952 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7543552Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7543711Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7543991Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7544132Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7544420Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7544568Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7545070Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T15:04:54.7545186Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7545380Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7545769Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7545887Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7546096Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7546264Z [rank1]:E1204 14:49:19.883000 429952 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7546305Z dist init r=1, world=4 2025-12-04T15:04:54.7546446Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7546604Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7546891Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T15:04:54.7547044Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7547340Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7547478Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7547756Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7547906Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7548191Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7548342Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7548627Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7548767Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7549043Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7549192Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7549699Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T15:04:54.7549812Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7550007Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7550437Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7550553Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7550762Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7550926Z [rank0]:E1204 14:49:19.896000 429951 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7550966Z dist init r=0, world=4 2025-12-04T15:04:54.7551300Z [rank3]:[W1204 14:49:20.526442005 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7551629Z [rank2]:[W1204 14:49:20.531885942 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7551990Z [rank1]:[W1204 14:49:20.649921523 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7552316Z [rank0]:[W1204 14:49:20.697712819 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7552370Z FAILED [47.0721s] [100%] 2025-12-04T15:04:54.7552372Z 2025-12-04T15:04:54.7552431Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7552561Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda _ 2025-12-04T15:04:54.7552612Z Traceback (most recent call last): 2025-12-04T15:04:54.7552789Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7552834Z self._join_processes(fn) 2025-12-04T15:04:54.7553007Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7553060Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7553238Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7553282Z raise RuntimeError(error) 2025-12-04T15:04:54.7553363Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7553408Z Traceback (most recent call last): 2025-12-04T15:04:54.7553569Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7553611Z getattr(self, test_name)() 2025-12-04T15:04:54.7553770Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7553805Z fn() 2025-12-04T15:04:54.7553956Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7553998Z method(*args, **kwargs) 2025-12-04T15:04:54.7554148Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7554189Z method(*args, **kwargs) 2025-12-04T15:04:54.7554339Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7554376Z with policy(): 2025-12-04T15:04:54.7554528Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7554570Z raise RuntimeError(msg) 2025-12-04T15:04:54.7554953Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
2025-12-04T15:04:54.7554956Z 2025-12-04T15:04:54.7555031Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7555292Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7555295Z 2025-12-04T15:04:54.7555385Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7555408Z 2025-12-04T15:04:54.7555480Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7555529Z Traceback (most recent call last): 2025-12-04T15:04:54.7555690Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7555733Z getattr(self, test_name)() 2025-12-04T15:04:54.7555889Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7555926Z fn() 2025-12-04T15:04:54.7556075Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7556130Z method(*args, **kwargs) 2025-12-04T15:04:54.7556280Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7556322Z method(*args, **kwargs) 2025-12-04T15:04:54.7556471Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7556511Z with policy(): 2025-12-04T15:04:54.7556672Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7556715Z raise RuntimeError(msg) 2025-12-04T15:04:54.7557095Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T15:04:54.7557098Z 2025-12-04T15:04:54.7557175Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7557437Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7557441Z 2025-12-04T15:04:54.7557529Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7557531Z 2025-12-04T15:04:54.7557534Z 2025-12-04T15:04:54.7557611Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7557699Z Process 2 terminated with exit code 10, terminating remaining processes. 
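On the FutureWarnings repeated throughout these sessions: `ShardingStrategy.NO_SHARD` keeps every rank's parameters fully replicated, which is the same model DistributedDataParallel already implements, hence the deprecation pointer. A hedged sketch of the suggested replacement follows, runnable as a single-process gloo group on CPU; the port and the tiny model are placeholders, and on GPU you would additionally pass device_ids=[local_rank]:

    import os
    import torch.distributed as dist
    import torch.nn as nn
    from torch.nn.parallel import DistributedDataParallel as DDP

    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29501")
    dist.init_process_group("gloo", rank=0, world_size=1)

    # Deprecated spelling seen in the log:
    #   FSDP(model, sharding_strategy=ShardingStrategy.NO_SHARD)
    # Replicated semantics via DDP instead:
    model = nn.Linear(8, 8)
    ddp_model = DDP(model)

    dist.destroy_process_group()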
2025-12-04T15:04:54.7557931Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2d64d19acd11a4dd.xml - 2025-12-04T15:04:54.7557992Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7558268Z FAILED [47.0721s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7558315Z Traceback (most recent call last): 2025-12-04T15:04:54.7558479Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7558522Z getattr(self, test_name)() 2025-12-04T15:04:54.7558682Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7558717Z fn() 2025-12-04T15:04:54.7558869Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7558909Z method(*args, **kwargs) 2025-12-04T15:04:54.7559061Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7559102Z method(*args, **kwargs) 2025-12-04T15:04:54.7559252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7559314Z with policy(): 2025-12-04T15:04:54.7559466Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7559508Z raise RuntimeError(msg) 2025-12-04T15:04:54.7559889Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
2025-12-04T15:04:54.7559891Z 2025-12-04T15:04:54.7559965Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7560276Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7560280Z 2025-12-04T15:04:54.7560368Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7560371Z 2025-12-04T15:04:54.7560431Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7560493Z Traceback (most recent call last): 2025-12-04T15:04:54.7560654Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7560697Z getattr(self, test_name)() 2025-12-04T15:04:54.7560854Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7560891Z fn() 2025-12-04T15:04:54.7561041Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7561083Z method(*args, **kwargs) 2025-12-04T15:04:54.7561232Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7561274Z method(*args, **kwargs) 2025-12-04T15:04:54.7561423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7561461Z with policy(): 2025-12-04T15:04:54.7561611Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7561655Z raise RuntimeError(msg) 2025-12-04T15:04:54.7562037Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T15:04:54.7562042Z 2025-12-04T15:04:54.7562114Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7562372Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7562375Z 2025-12-04T15:04:54.7562461Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7562528Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.7562591Z ====================== 1 failed, 26 deselected in 47.23s ======================= 2025-12-04T15:04:54.7562630Z Got exit code 1 2025-12-04T15:04:54.7562670Z Retrying single test... 
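Before the final retry below, one note on the recurring `device_id` UserWarnings: they are triggered because the test hands FSDP the index-less device string "cuda". A minimal sketch of the two remedies the warning itself names, with local_rank as a placeholder for this process's GPU index (assumes one visible CUDA/ROCm device; the single-process NCCL group and port are placeholders):

    import os
    import torch
    import torch.distributed as dist
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Placeholder single-process NCCL group.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29502")
    dist.init_process_group("nccl", rank=0, world_size=1)

    local_rank = 0                     # placeholder for the real local rank
    torch.cuda.set_device(local_rank)  # remedy 1: pin the device up front

    # Remedy 2: hand FSDP an explicit index, not the bare "cuda" string.
    wrapped = FSDP(nn.Linear(8, 8), device_id=local_rank)

    dist.destroy_process_group()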
2025-12-04T15:04:54.7562863Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-46bd4f48f3ceba23.xml 2025-12-04T15:04:54.7562921Z ============================= test session starts ============================== 2025-12-04T15:04:54.7563033Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7563100Z cachedir: .pytest_cache 2025-12-04T15:04:54.7563259Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7563305Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7563346Z configfile: pytest.ini 2025-12-04T15:04:54.7563507Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7563582Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.7563845Z stepcurrent: skipping 8 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7563891Z Running 1 items in this shard 2025-12-04T15:04:54.7563893Z 2025-12-04T15:04:54.7564225Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda I1204 14:49:36.688000 431292 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 431361 2025-12-04T15:04:54.7564392Z I1204 14:49:36.689000 431292 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 431362 2025-12-04T15:04:54.7564548Z I1204 14:49:36.690000 431292 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 431363 2025-12-04T15:04:54.7564696Z I1204 14:49:36.690000 431292 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 431364 2025-12-04T15:04:54.7565277Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7565317Z _warn_cpu_init() 2025-12-04T15:04:54.7565806Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7565869Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7566439Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7566481Z _warn_cpu_init() 2025-12-04T15:04:54.7567045Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7567085Z _warn_cpu_init() 2025-12-04T15:04:54.7567571Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7567658Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7568139Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7568212Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7568795Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7568835Z _warn_cpu_init() 2025-12-04T15:04:54.7569126Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7569209Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T15:04:54.7569701Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7569761Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7570046Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T15:04:54.7570128Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T15:04:54.7570640Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7570698Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7570985Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7571067Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T15:04:54.7571557Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7571615Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7571902Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7572007Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.7572295Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7572370Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.7572667Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7572740Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.7573239Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7573301Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7573586Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
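Aside on the FutureWarning repeated above: the `NO_SHARD` sharding strategy is deprecated, and the warning points to `DistributedDataParallel` as the replacement. A minimal sketch of that substitution, assuming one GPU per rank and an already-initialized process group (the wrapper function name is illustrative, not from the test):

    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def wrap_without_no_shard(model: torch.nn.Module) -> DDP:
        # Replaces FSDP(..., sharding_strategy=ShardingStrategy.NO_SHARD):
        # DDP keeps a full replica of the parameters on every rank.
        rank = dist.get_rank()
        torch.cuda.set_device(rank)  # one GPU per rank, as in this job
        return DDP(model.cuda(), device_ids=[rank])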
2025-12-04T15:04:54.7573668Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T15:04:54.7573953Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7574027Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.7574315Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7574360Z return func(*args, **kwargs) 2025-12-04T15:04:54.7574586Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7574629Z return func(*args, **kwargs) 2025-12-04T15:04:54.7574851Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7574894Z return func(*args, **kwargs) 2025-12-04T15:04:54.7575114Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7575158Z return func(*args, **kwargs) 2025-12-04T15:04:54.7575380Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7575421Z return func(*args, **kwargs) 2025-12-04T15:04:54.7575639Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7575679Z return func(*args, **kwargs) 2025-12-04T15:04:54.7575898Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7575939Z return func(*args, **kwargs) 2025-12-04T15:04:54.7576173Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7576224Z return func(*args, **kwargs) 2025-12-04T15:04:54.7576442Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T15:04:54.7576482Z return func(*args, **kwargs) 2025-12-04T15:04:54.7576629Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7576802Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7577093Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7577248Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7577544Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7577671Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7577948Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7578096Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7578378Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7578528Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7578802Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7578940Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7579216Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7579366Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7579876Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T15:04:54.7579991Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7580224Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7580628Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7580757Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7580968Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7581144Z [rank0]:E1204 14:50:09.434000 431361 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7581185Z dist init r=0, world=4 2025-12-04T15:04:54.7581324Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7581484Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7581784Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7581939Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7582222Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7582348Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7582624Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7582773Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7583052Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7583199Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7583477Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7583613Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T15:04:54.7583895Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7584042Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7584547Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T15:04:54.7584675Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7584881Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7585272Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7585402Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7585613Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7585776Z [rank3]:E1204 14:50:09.439000 431364 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7585817Z dist init r=3, world=4 2025-12-04T15:04:54.7585965Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7586126Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7586410Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7586565Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7586848Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7586973Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7587251Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7587398Z [rank2]:E1204 14:50:09.493000 431363 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7587678Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7587826Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7588106Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7588243Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7588518Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7588666Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7589172Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T15:04:54.7589308Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7589501Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7589906Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7590021Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7590284Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7590448Z [rank2]:E1204 14:50:09.493000 431363 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7590487Z dist init r=2, world=4 2025-12-04T15:04:54.7590625Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7590783Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7591069Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T15:04:54.7591224Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7591509Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7591633Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7591916Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7592065Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7592343Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7592489Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7592765Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7592903Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7593179Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7593359Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7593865Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 
2025-12-04T15:04:54.7593991Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7594187Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7594586Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7594699Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7594907Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7595071Z [rank1]:E1204 14:50:09.500000 431362 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7595109Z dist init r=1, world=4 2025-12-04T15:04:54.7595444Z [rank0]:[W1204 14:50:09.114152709 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7595774Z [rank3]:[W1204 14:50:09.124379269 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7596099Z [rank1]:[W1204 14:50:09.250491476 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7596424Z [rank2]:[W1204 14:50:09.262354837 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7596465Z FAILED [46.9720s] [100%] 2025-12-04T15:04:54.7596467Z 2025-12-04T15:04:54.7596527Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7596654Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda _ 2025-12-04T15:04:54.7596704Z Traceback (most recent call last): 2025-12-04T15:04:54.7596865Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7596911Z self._join_processes(fn) 2025-12-04T15:04:54.7597084Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7597137Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7597315Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7597383Z raise RuntimeError(error) 2025-12-04T15:04:54.7597464Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7597511Z Traceback (most recent call last): 2025-12-04T15:04:54.7600664Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7600713Z getattr(self, test_name)() 2025-12-04T15:04:54.7600876Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7600912Z fn() 2025-12-04T15:04:54.7601095Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7601139Z method(*args, **kwargs) 2025-12-04T15:04:54.7601291Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7601336Z method(*args, **kwargs) 2025-12-04T15:04:54.7601502Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7601542Z with policy(): 2025-12-04T15:04:54.7601697Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7601737Z raise RuntimeError(msg) 2025-12-04T15:04:54.7602128Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T15:04:54.7602130Z 2025-12-04T15:04:54.7602207Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7602470Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7602474Z 2025-12-04T15:04:54.7602564Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7602568Z 2025-12-04T15:04:54.7602570Z 2025-12-04T15:04:54.7602648Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7602738Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7602974Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-46bd4f48f3ceba23.xml - 2025-12-04T15:04:54.7603039Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7603316Z FAILED [46.9720s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7603369Z Traceback (most recent call last): 2025-12-04T15:04:54.7603536Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7603581Z getattr(self, test_name)() 2025-12-04T15:04:54.7603738Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7603776Z fn() 2025-12-04T15:04:54.7603925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7603972Z method(*args, **kwargs) 2025-12-04T15:04:54.7604121Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7604183Z method(*args, **kwargs) 2025-12-04T15:04:54.7604333Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7606411Z with policy(): 2025-12-04T15:04:54.7606566Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7606610Z raise RuntimeError(msg) 2025-12-04T15:04:54.7606994Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T15:04:54.7607015Z 2025-12-04T15:04:54.7607091Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7607354Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7607359Z 2025-12-04T15:04:54.7607446Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7607524Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
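On the `device_id` UserWarnings repeated through this session: FSDP received a bare "cuda" device with no index and fell back to the current device. A minimal sketch of the two remedies the warning itself names; the process group is assumed to be initialized already, and the Linear module is a placeholder:

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    rank = dist.get_rank()
    # Remedy 1: make the current device explicit before FSDP initialization.
    torch.cuda.set_device(rank)
    # Remedy 2: pass an indexed device rather than the bare "cuda" string.
    model = FSDP(torch.nn.Linear(8, 8), device_id=torch.device("cuda", rank))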
2025-12-04T15:04:54.7607589Z ====================== 1 failed, 26 deselected in 47.13s ======================= 2025-12-04T15:04:54.7607629Z Got exit code 1 2025-12-04T15:04:54.7607836Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T15:04:54.7607966Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.7608152Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-307f54560be92057.xml 2025-12-04T15:04:54.7608211Z ============================= test session starts ============================== 2025-12-04T15:04:54.7608326Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7608369Z cachedir: .pytest_cache 2025-12-04T15:04:54.7608530Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7608576Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7608616Z configfile: pytest.ini 2025-12-04T15:04:54.7608779Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7608853Z collecting ... collected 60 items / 9 deselected / 51 selected 2025-12-04T15:04:54.7608907Z stepcurrent: skipping 9 already run items. 2025-12-04T15:04:54.7608951Z Running 18 items in this shard 2025-12-04T15:04:54.7608953Z 2025-12-04T15:04:54.7609289Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda I1204 14:50:26.306000 432702 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 432771 2025-12-04T15:04:54.7609447Z I1204 14:50:26.307000 432702 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 432772 2025-12-04T15:04:54.7609599Z I1204 14:50:26.307000 432702 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 432773 2025-12-04T15:04:54.7609751Z I1204 14:50:26.308000 432702 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 432774 2025-12-04T15:04:54.7610368Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7610439Z _warn_cpu_init() 2025-12-04T15:04:54.7610734Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7610775Z _init_core_state( 2025-12-04T15:04:54.7611281Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7611346Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7611923Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7611962Z _warn_cpu_init() 2025-12-04T15:04:54.7612253Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7612290Z _init_core_state( 2025-12-04T15:04:54.7612777Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7612838Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7613403Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7613440Z _warn_cpu_init() 2025-12-04T15:04:54.7613732Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7613773Z _init_core_state( 2025-12-04T15:04:54.7614255Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7614315Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7614875Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7614933Z _warn_cpu_init() 2025-12-04T15:04:54.7615421Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7615477Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7615971Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7616029Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7616330Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7616369Z _init_core_state( 2025-12-04T15:04:54.7616851Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7616909Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7617196Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7617240Z return func(*args, **kwargs) 2025-12-04T15:04:54.7617721Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7617780Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7618007Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7618051Z return func(*args, **kwargs) 2025-12-04T15:04:54.7618276Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7618319Z return func(*args, **kwargs) 2025-12-04T15:04:54.7618538Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T15:04:54.7618578Z return func(*args, **kwargs) 2025-12-04T15:04:54.7618797Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7618836Z return func(*args, **kwargs) 2025-12-04T15:04:54.7619054Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7619113Z return func(*args, **kwargs) 2025-12-04T15:04:54.7619332Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7619373Z return func(*args, **kwargs) 2025-12-04T15:04:54.7619591Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7619632Z return func(*args, **kwargs) 2025-12-04T15:04:54.7619860Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7619901Z return func(*args, **kwargs) 2025-12-04T15:04:54.7620050Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7620258Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7620563Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7620719Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7621005Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7621131Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7621409Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7621558Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7621834Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7621982Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7622258Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7622398Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7622680Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7622827Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7623333Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T15:04:54.7623480Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7623676Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7624057Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7624184Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7624397Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7624562Z [rank2]:E1204 14:50:59.123000 432773 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7624611Z dist init r=2, world=4 2025-12-04T15:04:54.7624751Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7624909Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7625194Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7625347Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7625630Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7625753Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7626029Z [rank1]:E1204 14:50:59.136000 432772 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7626177Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7626452Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7626598Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7626872Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7627010Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7627289Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7627435Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7627944Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 
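On the earlier barrier() UserWarning ("You can specify `device_id` in `init_process_group` to mute this warning"): a minimal sketch of that call; the rank/world-size wiring and env:// rendezvous (MASTER_ADDR/MASTER_PORT set in the environment) are assumptions for illustration:

    import torch
    import torch.distributed as dist

    def init_with_bound_device(rank: int, world_size: int) -> None:
        # Binding the group to an indexed device lets collectives such as
        # barrier() pick the right GPU instead of the current context.
        dist.init_process_group(
            "nccl",
            rank=rank,
            world_size=world_size,
            device_id=torch.device("cuda", rank),
        )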
2025-12-04T15:04:54.7628069Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7628261Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7628650Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7628764Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7628984Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7629146Z [rank1]:E1204 14:50:59.136000 432772 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7629187Z dist init r=1, world=4 2025-12-04T15:04:54.7629325Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7629484Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7629770Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7629923Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7630240Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7630362Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7630638Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7630785Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7631062Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7631206Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7631481Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7631620Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T15:04:54.7631910Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7632071Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7632579Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T15:04:54.7632695Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7632890Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7633285Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7633398Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7633608Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7633771Z [rank3]:E1204 14:50:59.193000 432774 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7633811Z dist init r=3, world=4 2025-12-04T15:04:54.7633949Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7634111Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7634400Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7634552Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7634836Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7634961Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7635238Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7635387Z [rank0]:E1204 14:50:59.206000 432771 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7635661Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7635809Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7636083Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7636239Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7636518Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7636668Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7637180Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T15:04:54.7637313Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7637507Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7637886Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7638000Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7638209Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7638376Z [rank0]:E1204 14:50:59.206000 432771 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7638417Z dist init r=0, world=4 2025-12-04T15:04:54.7638756Z [rank2]:[W1204 14:50:59.806972583 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7639087Z [rank1]:[W1204 14:50:59.828557077 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7639414Z [rank3]:[W1204 14:50:59.086913263 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7639743Z [rank0]:[W1204 14:50:59.102479317 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7639784Z FAILED [47.1747s] [ 5%] 2025-12-04T15:04:54.7639786Z 2025-12-04T15:04:54.7639846Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7639968Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda _ 2025-12-04T15:04:54.7640016Z Traceback (most recent call last): 2025-12-04T15:04:54.7640207Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7640274Z self._join_processes(fn) 2025-12-04T15:04:54.7640448Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7640503Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7640681Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7640725Z raise RuntimeError(error) 2025-12-04T15:04:54.7640804Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.7640848Z Traceback (most recent call last): 2025-12-04T15:04:54.7641020Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7641062Z getattr(self, test_name)() 2025-12-04T15:04:54.7641219Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7641255Z fn() 2025-12-04T15:04:54.7641420Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7641461Z method(*args, **kwargs) 2025-12-04T15:04:54.7641610Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7641649Z method(*args, **kwargs) 2025-12-04T15:04:54.7641797Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7641834Z with policy(): 2025-12-04T15:04:54.7641986Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7642030Z raise RuntimeError(msg) 2025-12-04T15:04:54.7642409Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 
2025-12-04T15:04:54.7642412Z 2025-12-04T15:04:54.7642486Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7642740Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7642743Z 2025-12-04T15:04:54.7642831Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7642833Z 2025-12-04T15:04:54.7642892Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7642938Z Traceback (most recent call last): 2025-12-04T15:04:54.7643098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7643142Z getattr(self, test_name)() 2025-12-04T15:04:54.7643299Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7643335Z fn() 2025-12-04T15:04:54.7643484Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7643524Z method(*args, **kwargs) 2025-12-04T15:04:54.7643672Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7643712Z method(*args, **kwargs) 2025-12-04T15:04:54.7643861Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7643899Z with policy(): 2025-12-04T15:04:54.7644049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7644113Z raise RuntimeError(msg) 2025-12-04T15:04:54.7644486Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T15:04:54.7644488Z 2025-12-04T15:04:54.7644561Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7644823Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7644824Z 2025-12-04T15:04:54.7644916Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7644919Z 2025-12-04T15:04:54.7644921Z 2025-12-04T15:04:54.7645000Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7645089Z Process 1 terminated with exit code 10, terminating remaining processes. 
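[Note] "Process 1 terminated with exit code 10, terminating remaining processes." comes from the multi-process harness: each rank runs the test in its own process, the parent joins them, and any nonzero exit code fails the whole test (the real harness also tears down the surviving ranks early). A rough CPU-only sketch of that join-and-check pattern, not the actual common_distributed.py code:

    import multiprocessing as mp
    import sys

    # 10 is the exit code seen in this log for the mem-leak failures.
    EXIT_CODE = 10

    def worker(rank: int):
        # The real per-rank test body would run here; rank 1 simulates a failure.
        sys.exit(EXIT_CODE if rank == 1 else 0)

    def run_test(world_size: int = 4):
        procs = [mp.Process(target=worker, args=(r,)) for r in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                # Mirrors _check_return_codes: surface the failing rank and code.
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

    if __name__ == "__main__":
        run_test()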
2025-12-04T15:04:54.7645335Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-307f54560be92057.xml - 2025-12-04T15:04:54.7645397Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7645672Z FAILED [47.1747s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.7645721Z Traceback (most recent call last): 2025-12-04T15:04:54.7645886Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7645930Z getattr(self, test_name)() 2025-12-04T15:04:54.7646089Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7646127Z fn() 2025-12-04T15:04:54.7646280Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7646320Z method(*args, **kwargs) 2025-12-04T15:04:54.7646471Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7646511Z method(*args, **kwargs) 2025-12-04T15:04:54.7646661Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7646699Z with policy(): 2025-12-04T15:04:54.7646852Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7646895Z raise RuntimeError(msg) 2025-12-04T15:04:54.7647271Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 
2025-12-04T15:04:54.7647273Z 2025-12-04T15:04:54.7647349Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7647601Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7647603Z 2025-12-04T15:04:54.7647692Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7647694Z 2025-12-04T15:04:54.7647752Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7647810Z Traceback (most recent call last): 2025-12-04T15:04:54.7647971Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7648026Z getattr(self, test_name)() 2025-12-04T15:04:54.7648185Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7648224Z fn() 2025-12-04T15:04:54.7648372Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7648415Z method(*args, **kwargs) 2025-12-04T15:04:54.7648563Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7648618Z method(*args, **kwargs) 2025-12-04T15:04:54.7648768Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7648808Z with policy(): 2025-12-04T15:04:54.7648959Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7649003Z raise RuntimeError(msg) 2025-12-04T15:04:54.7649384Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T15:04:54.7649389Z 2025-12-04T15:04:54.7649462Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7649719Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7649721Z 2025-12-04T15:04:54.7649807Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7649875Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.7649938Z ======================= 1 failed, 9 deselected in 47.34s ======================= 2025-12-04T15:04:54.7649981Z Got exit code 1 2025-12-04T15:04:54.7650022Z Retrying single test... 
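[Note] "Retrying single test..." is the runner re-executing just the failed test in a fresh pytest session. A hedged sketch of such a retry loop; the test id and env vars come from the repro line above, while the loop structure and retry count are assumptions, not the actual run_test.py logic:

    import os
    import subprocess

    TEST_ID = ("test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::"
               "test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda")

    # Environment flags from the repro line in the log above.
    env = dict(os.environ,
               PYTORCH_TEST_WITH_ROCM="1",
               PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")

    # Re-run the single failing test in fresh fail-fast sessions until it
    # passes or the attempts run out (retry count is an assumption).
    for attempt in range(3):
        result = subprocess.run(["python", "-m", "pytest", "-x", TEST_ID], env=env)
        print(f"Got exit code {result.returncode}")
        if result.returncode == 0:
            break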
2025-12-04T15:04:54.7650252Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8883f8c783b3fd2a.xml 2025-12-04T15:04:54.7650313Z ============================= test session starts ============================== 2025-12-04T15:04:54.7650430Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7650473Z cachedir: .pytest_cache 2025-12-04T15:04:54.7650633Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7650680Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7650723Z configfile: pytest.ini 2025-12-04T15:04:54.7650884Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7650961Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.7651206Z stepcurrent: skipping 9 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7651253Z Running 1 items in this shard 2025-12-04T15:04:54.7651255Z 2025-12-04T15:04:54.7651581Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda I1204 14:51:16.198000 434112 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 434181 2025-12-04T15:04:54.7651736Z I1204 14:51:16.198000 434112 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 434182 2025-12-04T15:04:54.7651918Z I1204 14:51:16.199000 434112 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 434183 2025-12-04T15:04:54.7652067Z I1204 14:51:16.199000 434112 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 434184 2025-12-04T15:04:54.7652666Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7652707Z _warn_cpu_init() 2025-12-04T15:04:54.7653005Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7653045Z _init_core_state( 2025-12-04T15:04:54.7653548Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.7653610Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7654175Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7654215Z _warn_cpu_init() 2025-12-04T15:04:54.7654772Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7654813Z _warn_cpu_init() 2025-12-04T15:04:54.7655107Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7655147Z _init_core_state( 2025-12-04T15:04:54.7655632Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7655692Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7655983Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7656022Z _init_core_state( 2025-12-04T15:04:54.7656506Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7656585Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7657159Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7657199Z _warn_cpu_init() 2025-12-04T15:04:54.7657696Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7657757Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7658048Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7658087Z _init_core_state( 2025-12-04T15:04:54.7658571Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7658632Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7659122Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7659179Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7659466Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7659509Z return func(*args, **kwargs) 2025-12-04T15:04:54.7659995Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7660053Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7660316Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7660360Z return func(*args, **kwargs) 2025-12-04T15:04:54.7660585Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7660640Z return func(*args, **kwargs) 2025-12-04T15:04:54.7660871Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
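[Note] The repeated FSDP `device_id` warnings above recommend binding each rank to an explicitly indexed device before wrapping. A minimal sketch following that recommendation (assumes the process group is already initialized; not the test's actual code):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(module: torch.nn.Module, rank: int) -> FSDP:
        # Bind this process to its GPU first, as the warning suggests...
        torch.cuda.set_device(rank)
        # ...and give FSDP a device with an explicit index instead of bare "cuda".
        # Passing device_id also moves a CPU-resident module to the GPU for the
        # sharding initialization, which addresses the _warn_cpu_init() warning.
        return FSDP(module, device_id=torch.device("cuda", rank))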
2025-12-04T15:04:54.7660913Z return func(*args, **kwargs) 2025-12-04T15:04:54.7661133Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7661174Z return func(*args, **kwargs) 2025-12-04T15:04:54.7661391Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7661445Z return func(*args, **kwargs) 2025-12-04T15:04:54.7661661Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7661704Z return func(*args, **kwargs) 2025-12-04T15:04:54.7661933Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7661974Z return func(*args, **kwargs) 2025-12-04T15:04:54.7662191Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7662231Z return func(*args, **kwargs) 2025-12-04T15:04:54.7662373Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7662537Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7662823Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7662979Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7663263Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7663386Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7663662Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7663808Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7664087Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7664232Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7664508Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7664643Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7664929Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7665087Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7665598Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T15:04:54.7665714Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7665909Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7666304Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7666419Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7666629Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7666793Z [rank3]:E1204 14:51:49.015000 434184 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7666832Z dist init r=3, world=4 2025-12-04T15:04:54.7666972Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7667133Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7667420Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7667572Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7667854Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7667977Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7668253Z [rank2]:E1204 14:51:49.069000 434183 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7668400Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7668677Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7668825Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7669099Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7669264Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7669538Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7669685Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7670230Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
2025-12-04T15:04:54.7670346Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7670554Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7670934Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7671048Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7671257Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7671423Z [rank2]:E1204 14:51:49.069000 434183 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7671461Z dist init r=2, world=4 2025-12-04T15:04:54.7671598Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7671755Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7672041Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7672194Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7672477Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7672601Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7672874Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7673022Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7673301Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7673472Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7673746Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7673881Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T15:04:54.7674165Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7674311Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7674818Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T15:04:54.7674931Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7675126Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7675504Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7675619Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7675829Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7675991Z [rank0]:E1204 14:51:49.091000 434181 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7676030Z dist init r=0, world=4 2025-12-04T15:04:54.7676167Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7676324Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7676608Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7676763Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7677045Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7677170Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7677445Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7677610Z [rank1]:E1204 14:51:49.108000 434182 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7677886Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7678031Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7678315Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7678450Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7678727Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7678884Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7679382Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T15:04:54.7679495Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7679689Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7680069Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7680225Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7680438Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7680599Z [rank1]:E1204 14:51:49.108000 434182 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7680640Z dist init r=1, world=4 2025-12-04T15:04:54.7680973Z [rank3]:[W1204 14:51:49.717711595 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7681301Z [rank2]:[W1204 14:51:49.839170283 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7681626Z [rank0]:[W1204 14:51:49.009384576 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7681949Z [rank1]:[W1204 14:51:49.019096649 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7682014Z FAILED [47.1717s] [100%] 2025-12-04T15:04:54.7682017Z 2025-12-04T15:04:54.7682072Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7682195Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda _ 2025-12-04T15:04:54.7682240Z Traceback (most recent call last): 2025-12-04T15:04:54.7682403Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7682448Z self._join_processes(fn) 2025-12-04T15:04:54.7682635Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7682691Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7682867Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7682913Z raise RuntimeError(error) 2025-12-04T15:04:54.7683003Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7683048Z Traceback (most recent call last): 2025-12-04T15:04:54.7683206Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7683249Z getattr(self, test_name)() 2025-12-04T15:04:54.7683405Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7683443Z fn() 2025-12-04T15:04:54.7683591Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7683633Z method(*args, **kwargs) 2025-12-04T15:04:54.7683781Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7683824Z method(*args, **kwargs) 2025-12-04T15:04:54.7683972Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7684010Z with policy(): 2025-12-04T15:04:54.7684159Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7684202Z raise RuntimeError(msg) 2025-12-04T15:04:54.7684577Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
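[Note] The ProcessGroupNCCL warnings on every rank point at missing teardown. A minimal init/teardown sketch along the lines of the linked shutdown docs (rendezvous address, port, and world size are placeholders):

    import os
    import torch.distributed as dist

    def main(rank: int, world_size: int):
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            ...  # collectives / test body
        finally:
            # Explicit teardown avoids the "destroy_process_group() was not
            # called before program exit" warning seen above.
            dist.destroy_process_group()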
2025-12-04T15:04:54.7684580Z 2025-12-04T15:04:54.7684655Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7684911Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7684916Z 2025-12-04T15:04:54.7685003Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7685006Z 2025-12-04T15:04:54.7685008Z 2025-12-04T15:04:54.7685083Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7685170Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7685401Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8883f8c783b3fd2a.xml - 2025-12-04T15:04:54.7685461Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7685749Z FAILED [47.1717s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7685805Z Traceback (most recent call last): 2025-12-04T15:04:54.7685968Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7686010Z getattr(self, test_name)() 2025-12-04T15:04:54.7686167Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7686202Z fn() 2025-12-04T15:04:54.7686367Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7686407Z method(*args, **kwargs) 2025-12-04T15:04:54.7686558Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7686599Z method(*args, **kwargs) 2025-12-04T15:04:54.7686746Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7686792Z with policy(): 2025-12-04T15:04:54.7686942Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7686982Z raise RuntimeError(msg) 2025-12-04T15:04:54.7687359Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T15:04:54.7687361Z 2025-12-04T15:04:54.7687437Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7687692Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7687695Z 2025-12-04T15:04:54.7687782Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7687844Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
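[Note] As the failure text says, the repro banner can be silenced with PYTORCH_PRINT_REPRO_ON_FAILURE=0; one way to set it for a local run (assuming it is read from the test process's environment):

    import os

    # Silences the "To execute this test, run the following..." banner above.
    os.environ["PYTORCH_PRINT_REPRO_ON_FAILURE"] = "0"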
2025-12-04T15:04:54.7687909Z ====================== 1 failed, 26 deselected in 47.31s ======================= 2025-12-04T15:04:54.7687945Z Got exit code 1 2025-12-04T15:04:54.7687986Z Retrying single test... 2025-12-04T15:04:54.7688176Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b39e3d98ad9a9065.xml 2025-12-04T15:04:54.7688234Z ============================= test session starts ============================== 2025-12-04T15:04:54.7688344Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7688386Z cachedir: .pytest_cache 2025-12-04T15:04:54.7688544Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7688591Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7688630Z configfile: pytest.ini 2025-12-04T15:04:54.7688791Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7688864Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.7689111Z stepcurrent: skipping 9 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7689154Z Running 1 items in this shard 2025-12-04T15:04:54.7689156Z 2025-12-04T15:04:54.7689483Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda I1204 14:52:06.202000 435522 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 435591 2025-12-04T15:04:54.7689661Z I1204 14:52:06.202000 435522 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 435592 2025-12-04T15:04:54.7689810Z I1204 14:52:06.203000 435522 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 435593 2025-12-04T15:04:54.7689959Z I1204 14:52:06.204000 435522 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 435594 2025-12-04T15:04:54.7690578Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7690619Z _warn_cpu_init() 2025-12-04T15:04:54.7690926Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7690965Z _init_core_state( 2025-12-04T15:04:54.7691452Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.7691514Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7692081Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7692119Z _warn_cpu_init() 2025-12-04T15:04:54.7692410Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7692448Z _init_core_state( 2025-12-04T15:04:54.7692933Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7692994Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7693557Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7693593Z _warn_cpu_init() 2025-12-04T15:04:54.7693887Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7693953Z _init_core_state( 2025-12-04T15:04:54.7694436Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7694494Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7695064Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7695102Z _warn_cpu_init() 2025-12-04T15:04:54.7695596Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7695655Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7696136Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7696193Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7696483Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T15:04:54.7696520Z _init_core_state( 2025-12-04T15:04:54.7696999Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7697056Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7697535Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7697593Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7697876Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7697920Z return func(*args, **kwargs) 2025-12-04T15:04:54.7698145Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7698201Z return func(*args, **kwargs) 2025-12-04T15:04:54.7698439Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7698480Z return func(*args, **kwargs) 2025-12-04T15:04:54.7698703Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T15:04:54.7698744Z return func(*args, **kwargs) 2025-12-04T15:04:54.7698973Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7699013Z return func(*args, **kwargs) 2025-12-04T15:04:54.7699230Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7699272Z return func(*args, **kwargs) 2025-12-04T15:04:54.7699505Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7699545Z return func(*args, **kwargs) 2025-12-04T15:04:54.7699762Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7699801Z return func(*args, **kwargs) 2025-12-04T15:04:54.7700017Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7700058Z return func(*args, **kwargs) 2025-12-04T15:04:54.7700233Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7700397Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7700687Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7700842Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7701132Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7701256Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7701533Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7701682Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7701957Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7702106Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7702379Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7702543Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7702819Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7702967Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7703482Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T15:04:54.7703600Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7703807Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7704190Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7704305Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7704517Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7704681Z [rank0]:E1204 14:52:38.895000 435591 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7704720Z dist init r=0, world=4 2025-12-04T15:04:54.7704859Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7705018Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7705304Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7705459Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7705745Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7705872Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7706151Z [rank3]:E1204 14:52:38.937000 435594 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7706297Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7706572Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7706727Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7707014Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7707149Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7707433Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7707580Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7708091Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
2025-12-04T15:04:54.7708207Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7708399Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7708780Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7708893Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7709104Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7709267Z [rank3]:E1204 14:52:38.937000 435594 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7709306Z dist init r=3, world=4 2025-12-04T15:04:54.7709443Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7709601Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7709886Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7710041Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7710360Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7710481Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7710755Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7710916Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7711203Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7711348Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7711620Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7711771Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T15:04:54.7712046Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7712207Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7712706Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T15:04:54.7712820Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7713013Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7713394Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7713507Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7713714Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7713878Z [rank1]:E1204 14:52:38.969000 435592 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7713916Z dist init r=1, world=4 2025-12-04T15:04:54.7714054Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7714211Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7714495Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7714647Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7714932Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7715053Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7715348Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7715494Z [rank2]:E1204 14:52:38.980000 435593 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7715766Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7715927Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7716199Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7716346Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7716619Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7716767Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7717266Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T15:04:54.7717381Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7717575Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7717952Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7718065Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7718273Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7718437Z [rank2]:E1204 14:52:38.980000 435593 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7718477Z dist init r=2, world=4 2025-12-04T15:04:54.7718813Z [rank0]:[W1204 14:52:39.576956341 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7719144Z [rank3]:[W1204 14:52:39.697735044 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7719470Z [rank2]:[W1204 14:52:39.744451006 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7719813Z [rank1]:[W1204 14:52:39.746762658 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7719853Z FAILED [46.9724s] [100%] 2025-12-04T15:04:54.7719856Z 2025-12-04T15:04:54.7719911Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7720032Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda _ 2025-12-04T15:04:54.7720086Z Traceback (most recent call last): 2025-12-04T15:04:54.7720280Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7720325Z self._join_processes(fn) 2025-12-04T15:04:54.7720498Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7720567Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7720748Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7720790Z raise RuntimeError(error) 2025-12-04T15:04:54.7720870Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7720917Z Traceback (most recent call last): 2025-12-04T15:04:54.7721078Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7721121Z getattr(self, test_name)() 2025-12-04T15:04:54.7721282Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7721319Z fn() 2025-12-04T15:04:54.7721473Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7721515Z method(*args, **kwargs) 2025-12-04T15:04:54.7721667Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7721708Z method(*args, **kwargs) 2025-12-04T15:04:54.7721859Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7721897Z with policy(): 2025-12-04T15:04:54.7722052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7722094Z raise RuntimeError(msg) 2025-12-04T15:04:54.7722472Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T15:04:54.7722476Z 2025-12-04T15:04:54.7722551Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7722807Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7722810Z 2025-12-04T15:04:54.7722899Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7722901Z 2025-12-04T15:04:54.7722903Z 2025-12-04T15:04:54.7722978Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7723068Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7723312Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b39e3d98ad9a9065.xml - 2025-12-04T15:04:54.7723390Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7723659Z FAILED [46.9724s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7723709Z Traceback (most recent call last): 2025-12-04T15:04:54.7723873Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7723918Z getattr(self, test_name)() 2025-12-04T15:04:54.7724089Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7724126Z fn() 2025-12-04T15:04:54.7724276Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7724321Z method(*args, **kwargs) 2025-12-04T15:04:54.7724480Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7724524Z method(*args, **kwargs) 2025-12-04T15:04:54.7724673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7724713Z with policy(): 2025-12-04T15:04:54.7724861Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7724904Z raise RuntimeError(msg) 2025-12-04T15:04:54.7725278Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T15:04:54.7725286Z 2025-12-04T15:04:54.7725359Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7725614Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7725616Z 2025-12-04T15:04:54.7725701Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7725764Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
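Note the pattern in the warnings reprinted on every retry: each rank passes the bare device string "cuda" as `device_id`, so FSDP falls back to whatever the current device happens to be, and the module starts on CPU. The fix the warnings themselves suggest is to pin each rank to an explicit device index before wrapping; a minimal sketch (the module and process-group setup are assumed, not taken from the test):

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(module: nn.Module) -> FSDP:
    rank = dist.get_rank()
    # Give "cuda" an explicit index for this process, silencing the
    # _init_utils.py:571 "does not have an explicit index" warning above.
    torch.cuda.set_device(rank)
    # An indexed device_id also moves sharding initialization onto the GPU,
    # addressing the _init_utils.py:1014 CPU-init warning.
    return FSDP(module, device_id=torch.device("cuda", rank))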
2025-12-04T15:04:54.7725827Z ====================== 1 failed, 26 deselected in 47.13s ======================= 2025-12-04T15:04:54.7725866Z Got exit code 1 2025-12-04T15:04:54.7726069Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T15:04:54.7726198Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.7726384Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-76c12a4db26cdcd5.xml 2025-12-04T15:04:54.7726442Z ============================= test session starts ============================== 2025-12-04T15:04:54.7726553Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7726595Z cachedir: .pytest_cache 2025-12-04T15:04:54.7726752Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7726800Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7726841Z configfile: pytest.ini 2025-12-04T15:04:54.7727003Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7727097Z collecting ... collected 60 items / 10 deselected / 50 selected 2025-12-04T15:04:54.7727151Z stepcurrent: skipping 10 already run items. 2025-12-04T15:04:54.7727194Z Running 17 items in this shard 2025-12-04T15:04:54.7727196Z 2025-12-04T15:04:54.7727538Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda I1204 14:52:55.801000 436932 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 437001 2025-12-04T15:04:54.7727696Z I1204 14:52:55.802000 436932 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 437002 2025-12-04T15:04:54.7727858Z I1204 14:52:55.802000 436932 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 437003 2025-12-04T15:04:54.7728009Z I1204 14:52:55.803000 436932 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 437004 2025-12-04T15:04:54.7728599Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7728637Z _warn_cpu_init() 2025-12-04T15:04:54.7728939Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7728979Z _init_core_state( 2025-12-04T15:04:54.7729467Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7729530Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7730094Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7730131Z _warn_cpu_init() 2025-12-04T15:04:54.7730466Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7730504Z _init_core_state( 2025-12-04T15:04:54.7730998Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7731059Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7731625Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7731692Z _warn_cpu_init() 2025-12-04T15:04:54.7731990Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7732029Z _init_core_state( 2025-12-04T15:04:54.7732529Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7732592Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7733171Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7733209Z _warn_cpu_init() 2025-12-04T15:04:54.7733699Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7733756Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7734240Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7734298Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7734596Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7734634Z _init_core_state( 2025-12-04T15:04:54.7735117Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7735175Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7735464Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7735508Z return func(*args, **kwargs) 2025-12-04T15:04:54.7735991Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7736071Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7736297Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7736338Z return func(*args, **kwargs) 2025-12-04T15:04:54.7736560Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7736601Z return func(*args, **kwargs) 2025-12-04T15:04:54.7736831Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T15:04:54.7736873Z return func(*args, **kwargs) 2025-12-04T15:04:54.7737091Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7737133Z return func(*args, **kwargs) 2025-12-04T15:04:54.7737359Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7737400Z return func(*args, **kwargs) 2025-12-04T15:04:54.7737619Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7737660Z return func(*args, **kwargs) 2025-12-04T15:04:54.7737881Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7737922Z return func(*args, **kwargs) 2025-12-04T15:04:54.7738142Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7738182Z return func(*args, **kwargs) 2025-12-04T15:04:54.7738326Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7738488Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7738778Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7738936Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7739221Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7739346Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7739622Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7739771Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7740049Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7740269Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7740543Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7740678Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7740968Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7741116Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7741647Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T15:04:54.7741762Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7741960Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7742357Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7742473Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7742683Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7742846Z [rank3]:E1204 14:53:28.703000 437004 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7742886Z dist init r=3, world=4 2025-12-04T15:04:54.7743024Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7743183Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7743468Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7743622Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7743904Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7744028Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7744305Z [rank1]:E1204 14:53:28.705000 437002 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7744478Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7744755Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7744901Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7745184Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7745319Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7745596Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7745756Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7746264Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 
2025-12-04T15:04:54.7746378Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7746573Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7746970Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7747082Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7747294Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7747458Z [rank1]:E1204 14:53:28.705000 437002 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7747497Z dist init r=1, world=4 2025-12-04T15:04:54.7747637Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7747795Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7748080Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7748233Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7748515Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7748653Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7748937Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7749084Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7749373Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7749521Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7749797Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7749945Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T15:04:54.7750259Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7750409Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7750915Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T15:04:54.7751032Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7751227Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7751619Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7751733Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7751941Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7752106Z [rank0]:E1204 14:53:28.755000 437001 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7752145Z dist init r=0, world=4 2025-12-04T15:04:54.7752285Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7752441Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7752726Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7752893Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7753187Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7753312Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7753584Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7753744Z [rank2]:E1204 14:53:28.761000 437003 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7754019Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7754181Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7754454Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7754592Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7754872Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7755021Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7755539Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T15:04:54.7755654Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7755851Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7756247Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7756364Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7756575Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7756738Z [rank2]:E1204 14:53:28.761000 437003 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7756778Z dist init r=2, world=4 2025-12-04T15:04:54.7757112Z [rank3]:[W1204 14:53:28.394807937 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7757459Z [rank1]:[W1204 14:53:28.396468416 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7757783Z [rank0]:[W1204 14:53:29.536405061 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7758117Z [rank2]:[W1204 14:53:29.619480963 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7758161Z FAILED [47.2739s] [ 5%] 2025-12-04T15:04:54.7758163Z 2025-12-04T15:04:54.7758222Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7758363Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda _ 2025-12-04T15:04:54.7758409Z Traceback (most recent call last): 2025-12-04T15:04:54.7758574Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7758617Z self._join_processes(fn) 2025-12-04T15:04:54.7758789Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7758842Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7759018Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7759062Z raise RuntimeError(error) 2025-12-04T15:04:54.7759141Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7759185Z Traceback (most recent call last): 2025-12-04T15:04:54.7759344Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7759386Z getattr(self, test_name)() 2025-12-04T15:04:54.7759541Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7759577Z fn() 2025-12-04T15:04:54.7759726Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7759769Z method(*args, **kwargs) 2025-12-04T15:04:54.7759922Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7759962Z method(*args, **kwargs) 2025-12-04T15:04:54.7760115Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7760154Z with policy(): 2025-12-04T15:04:54.7760352Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7760394Z raise RuntimeError(msg) 2025-12-04T15:04:54.7760779Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
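
Every rank reports the same failure mode: the test body itself completes, and the error is raised by the `with policy():` context manager on `__exit__`, which compares CUDA caching-allocator usage before and after the test (512 vs. 80384 bytes here, with driver-level allocations growing from roughly 2.3-2.5 GB to about 17.5 GB). A minimal sketch of that check-on-exit pattern, using only the public torch.cuda memory APIs; this is an illustration, not PyTorch's actual CudaMemoryLeakCheck implementation, and the class name is hypothetical:

    import torch

    class LeakCheckPolicy:
        # Hypothetical stand-in for the `policy()` object in the tracebacks:
        # snapshot allocator usage on entry, re-check on exit, raise if it grew.
        def __init__(self, device=0):
            self.device = device

        def __enter__(self):
            torch.cuda.synchronize(self.device)
            self.before = torch.cuda.memory_allocated(self.device)
            return self

        def __exit__(self, exc_type, exc, tb):
            if exc_type is not None:
                return False  # let the test's own exception propagate first
            torch.cuda.synchronize(self.device)
            torch.cuda.empty_cache()  # drop cached blocks so only live tensors count
            after = torch.cuda.memory_allocated(self.device)
            if after > self.before:
                raise RuntimeError(
                    f"possible leak on device {self.device}: "
                    f"{self.before} -> {after} bytes still allocated"
                )
            return False
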
2025-12-04T15:04:54.7760781Z 2025-12-04T15:04:54.7760858Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7761124Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7761153Z 2025-12-04T15:04:54.7761241Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7761243Z 2025-12-04T15:04:54.7761246Z 2025-12-04T15:04:54.7761320Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7761409Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7761641Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-76c12a4db26cdcd5.xml - 2025-12-04T15:04:54.7761701Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7761995Z FAILED [47.2739s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7762043Z Traceback (most recent call last): 2025-12-04T15:04:54.7762205Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7762267Z getattr(self, test_name)() 2025-12-04T15:04:54.7762426Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7762462Z fn() 2025-12-04T15:04:54.7762612Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7762655Z method(*args, **kwargs) 2025-12-04T15:04:54.7762806Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7762850Z method(*args, **kwargs) 2025-12-04T15:04:54.7762999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7763038Z with policy(): 2025-12-04T15:04:54.7763188Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7763233Z raise RuntimeError(msg) 2025-12-04T15:04:54.7763620Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T15:04:54.7763622Z 2025-12-04T15:04:54.7763695Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7763963Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7763966Z 2025-12-04T15:04:54.7764052Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7764118Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
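
Each failed run also ends with a ProcessGroupNCCL warning that destroy_process_group() was never called before exit. A hedged sketch of the teardown the warning asks for, using the documented torch.distributed API (launcher-provided MASTER_ADDR/MASTER_PORT are assumed; passing `device_id` to init_process_group is optional and also mutes the barrier() "device under current context" warning that appears later in this log):

    import torch
    import torch.distributed as dist

    def run(rank: int, world_size: int):
        torch.cuda.set_device(rank)
        dist.init_process_group(
            "nccl",
            rank=rank,
            world_size=world_size,
            # Binding the group to an explicit device mutes the barrier() warning.
            device_id=torch.device("cuda", rank),
        )
        try:
            pass  # per-rank test body would run here
        finally:
            # Explicit teardown avoids the ProcessGroupNCCL shutdown warning.
            dist.destroy_process_group()
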
2025-12-04T15:04:54.7764181Z ====================== 1 failed, 10 deselected in 47.44s ======================= 2025-12-04T15:04:54.7764222Z Got exit code 1 2025-12-04T15:04:54.7764261Z Retrying single test... 2025-12-04T15:04:54.7764450Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f364db0ee0104d7c.xml 2025-12-04T15:04:54.7764506Z ============================= test session starts ============================== 2025-12-04T15:04:54.7764619Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7764660Z cachedir: .pytest_cache 2025-12-04T15:04:54.7764817Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7764885Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7764927Z configfile: pytest.ini 2025-12-04T15:04:54.7765088Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7765165Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.7765425Z stepcurrent: skipping 10 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7765469Z Running 1 items in this shard 2025-12-04T15:04:54.7765471Z 2025-12-04T15:04:54.7765821Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda I1204 14:53:45.638000 438342 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 438411 2025-12-04T15:04:54.7765978Z I1204 14:53:45.639000 438342 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 438412 2025-12-04T15:04:54.7766140Z I1204 14:53:45.639000 438342 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 438413 2025-12-04T15:04:54.7766288Z I1204 14:53:45.640000 438342 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 438414 2025-12-04T15:04:54.7766858Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7766896Z _warn_cpu_init() 2025-12-04T15:04:54.7767196Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7767235Z _init_core_state( 2025-12-04T15:04:54.7767721Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7767785Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7768354Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7768394Z _warn_cpu_init() 2025-12-04T15:04:54.7768689Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7768728Z _init_core_state( 2025-12-04T15:04:54.7769214Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7769291Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7769854Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7769890Z _warn_cpu_init() 2025-12-04T15:04:54.7770234Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7770272Z _init_core_state( 2025-12-04T15:04:54.7770768Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7770830Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7771389Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7771427Z _warn_cpu_init() 2025-12-04T15:04:54.7771911Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7771968Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7772448Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7772505Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7772806Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7772843Z _init_core_state( 2025-12-04T15:04:54.7773324Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7773381Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7773668Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7773735Z return func(*args, **kwargs) 2025-12-04T15:04:54.7774216Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7774273Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7774514Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7774557Z return func(*args, **kwargs) 2025-12-04T15:04:54.7774779Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7774822Z return func(*args, **kwargs) 2025-12-04T15:04:54.7775052Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T15:04:54.7775095Z return func(*args, **kwargs) 2025-12-04T15:04:54.7775312Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7775353Z return func(*args, **kwargs) 2025-12-04T15:04:54.7775568Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7775610Z return func(*args, **kwargs) 2025-12-04T15:04:54.7775827Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7775869Z return func(*args, **kwargs) 2025-12-04T15:04:54.7776084Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7776124Z return func(*args, **kwargs) 2025-12-04T15:04:54.7776340Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7776379Z return func(*args, **kwargs) 2025-12-04T15:04:54.7776525Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7776685Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7776974Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7777127Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7777412Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7777537Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7777812Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7777981Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7778256Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7778402Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7778692Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7778831Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7779118Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7779268Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7779787Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T15:04:54.7779902Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7780100Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7780520Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7780636Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7780846Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7781013Z [rank2]:E1204 14:54:18.457000 438413 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7781054Z dist init r=2, world=4 2025-12-04T15:04:54.7781194Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7781354Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7781642Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7781798Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7782084Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7782233Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7782507Z [rank3]:E1204 14:54:18.458000 438414 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7782655Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7782943Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7783091Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7783378Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7783517Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7783797Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7783946Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7784459Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
2025-12-04T15:04:54.7784575Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7784771Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7785165Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7785281Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7785494Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7785657Z [rank3]:E1204 14:54:18.458000 438414 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7785699Z dist init r=3, world=4 2025-12-04T15:04:54.7785837Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7785998Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7786285Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7786461Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7786742Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7786866Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7787150Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7787296Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7787572Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7787733Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7788009Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7788144Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T15:04:54.7788423Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7788572Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7789079Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T15:04:54.7789193Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7789387Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7789778Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7789891Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7790101Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7790294Z [rank1]:E1204 14:54:18.464000 438412 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7790336Z dist init r=1, world=4 2025-12-04T15:04:54.7790473Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7790645Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7790949Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7791102Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7791400Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7791522Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7791801Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7791963Z [rank0]:E1204 14:54:18.509000 438411 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7792238Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7792386Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7792660Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7792797Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7793074Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7793222Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7793729Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T15:04:54.7793844Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7794038Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7794430Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7794545Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7794753Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7794930Z [rank0]:E1204 14:54:18.509000 438411 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7794985Z dist init r=0, world=4 2025-12-04T15:04:54.7795322Z [rank1]:[W1204 14:54:18.158014705 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7795660Z [rank3]:[W1204 14:54:18.177857290 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7795987Z [rank2]:[W1204 14:54:18.183197484 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7796325Z [rank0]:[W1204 14:54:18.287366677 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7796366Z FAILED [47.2736s] [100%] 2025-12-04T15:04:54.7796368Z 2025-12-04T15:04:54.7796428Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7796559Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda _ 2025-12-04T15:04:54.7796607Z Traceback (most recent call last): 2025-12-04T15:04:54.7796770Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7796815Z self._join_processes(fn) 2025-12-04T15:04:54.7796988Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7797043Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7797220Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7797266Z raise RuntimeError(error) 2025-12-04T15:04:54.7797344Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7797391Z Traceback (most recent call last): 2025-12-04T15:04:54.7797550Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7797594Z getattr(self, test_name)() 2025-12-04T15:04:54.7797752Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7797790Z fn() 2025-12-04T15:04:54.7797941Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7797985Z method(*args, **kwargs) 2025-12-04T15:04:54.7798135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7798177Z method(*args, **kwargs) 2025-12-04T15:04:54.7798325Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7798364Z with policy(): 2025-12-04T15:04:54.7798514Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7798557Z raise RuntimeError(msg) 2025-12-04T15:04:54.7798942Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T15:04:54.7798965Z 2025-12-04T15:04:54.7799040Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7799306Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7799309Z 2025-12-04T15:04:54.7799396Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7799398Z 2025-12-04T15:04:54.7799457Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7799510Z Traceback (most recent call last): 2025-12-04T15:04:54.7799672Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7799715Z getattr(self, test_name)() 2025-12-04T15:04:54.7799875Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7799911Z fn() 2025-12-04T15:04:54.7800072Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7800113Z method(*args, **kwargs) 2025-12-04T15:04:54.7800299Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7800339Z method(*args, **kwargs) 2025-12-04T15:04:54.7800490Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7800528Z with policy(): 2025-12-04T15:04:54.7800681Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7800723Z raise RuntimeError(msg) 2025-12-04T15:04:54.7801107Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
2025-12-04T15:04:54.7801110Z 2025-12-04T15:04:54.7801187Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7801449Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7801451Z 2025-12-04T15:04:54.7801540Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7801542Z 2025-12-04T15:04:54.7801600Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7801645Z Traceback (most recent call last): 2025-12-04T15:04:54.7801806Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7801849Z getattr(self, test_name)() 2025-12-04T15:04:54.7802005Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7802041Z fn() 2025-12-04T15:04:54.7802190Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7802231Z method(*args, **kwargs) 2025-12-04T15:04:54.7802379Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7802419Z method(*args, **kwargs) 2025-12-04T15:04:54.7802567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7802621Z with policy(): 2025-12-04T15:04:54.7802784Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7802826Z raise RuntimeError(msg) 2025-12-04T15:04:54.7803208Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T15:04:54.7803211Z 2025-12-04T15:04:54.7803284Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7803741Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7803744Z 2025-12-04T15:04:54.7803831Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7803834Z 2025-12-04T15:04:54.7803836Z 2025-12-04T15:04:54.7803915Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7804018Z Process 0 terminated with exit code 10, terminating remaining processes. 
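
The "Process 0 terminated with exit code 10, terminating remaining processes." line shows the harness contract: each rank exits with code 10 on failure, and the parent kills the surviving ranks and re-raises. A simplified sketch of that pattern with torch.multiprocessing, assuming its documented spawn/join behavior; this is not the actual common_distributed.py implementation:

    import sys
    import torch.multiprocessing as mp

    def worker(rank: int, world_size: int):
        try:
            pass  # per-rank test body
        except Exception:
            sys.exit(10)  # the exit code the parent reports above

    def main(world_size: int = 4):
        ctx = mp.spawn(worker, args=(world_size,), nprocs=world_size, join=False)
        # join() terminates the remaining ranks and raises if any rank exited
        # nonzero, mirroring the "terminating remaining processes" message.
        try:
            ctx.join()
        except mp.ProcessExitedException as e:
            raise RuntimeError(
                f"Process {e.error_index} exited with error code {e.exit_code}"
            ) from e
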
2025-12-04T15:04:54.7804253Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f364db0ee0104d7c.xml - 2025-12-04T15:04:54.7804313Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7804592Z FAILED [47.2736s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7804638Z Traceback (most recent call last): 2025-12-04T15:04:54.7804807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7804853Z getattr(self, test_name)() 2025-12-04T15:04:54.7805011Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7805048Z fn() 2025-12-04T15:04:54.7805197Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7805238Z method(*args, **kwargs) 2025-12-04T15:04:54.7805388Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7805428Z method(*args, **kwargs) 2025-12-04T15:04:54.7805582Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7805618Z with policy(): 2025-12-04T15:04:54.7805769Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7805812Z raise RuntimeError(msg) 2025-12-04T15:04:54.7806195Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T15:04:54.7806197Z 2025-12-04T15:04:54.7806273Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7806538Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7806540Z 2025-12-04T15:04:54.7806630Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7806650Z 2025-12-04T15:04:54.7806709Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7806770Z Traceback (most recent call last): 2025-12-04T15:04:54.7806932Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7806977Z getattr(self, test_name)() 2025-12-04T15:04:54.7807135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7807171Z fn() 2025-12-04T15:04:54.7807323Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7807374Z method(*args, **kwargs) 2025-12-04T15:04:54.7807526Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7807564Z method(*args, **kwargs) 2025-12-04T15:04:54.7807716Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7807754Z with policy(): 2025-12-04T15:04:54.7807916Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7807958Z raise RuntimeError(msg) 2025-12-04T15:04:54.7808340Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
2025-12-04T15:04:54.7808342Z 2025-12-04T15:04:54.7808416Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7808682Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7808685Z 2025-12-04T15:04:54.7808771Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7808773Z 2025-12-04T15:04:54.7808836Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7808881Z Traceback (most recent call last): 2025-12-04T15:04:54.7809043Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7809085Z getattr(self, test_name)() 2025-12-04T15:04:54.7809243Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7809280Z fn() 2025-12-04T15:04:54.7809430Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7809471Z method(*args, **kwargs) 2025-12-04T15:04:54.7809621Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7809664Z method(*args, **kwargs) 2025-12-04T15:04:54.7809813Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7809852Z with policy(): 2025-12-04T15:04:54.7810001Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7810044Z raise RuntimeError(msg) 2025-12-04T15:04:54.7810456Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T15:04:54.7810478Z 2025-12-04T15:04:54.7810556Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7810836Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7810838Z 2025-12-04T15:04:54.7810924Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7810987Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.7811053Z ====================== 1 failed, 26 deselected in 47.44s ======================= 2025-12-04T15:04:54.7811091Z Got exit code 1 2025-12-04T15:04:54.7811132Z Retrying single test... 
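
Each retry session header loads the hypothesis profile 'pytorch_ci' (database=None, max_examples=50, derandomize=True, too_slow health check suppressed). For reference, such a profile is registered through hypothesis's settings API; the values below are taken from the session headers, and where the registration lives (e.g. a conftest.py) is an assumption:

    from hypothesis import HealthCheck, settings

    settings.register_profile(
        "pytorch_ci",
        database=None,      # no example database
        max_examples=50,
        derandomize=True,   # deterministic example generation in CI
        suppress_health_check=[HealthCheck.too_slow],
    )
    settings.load_profile("pytorch_ci")
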
2025-12-04T15:04:54.7811335Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4261ae837511da77.xml 2025-12-04T15:04:54.7811393Z ============================= test session starts ============================== 2025-12-04T15:04:54.7811507Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7811552Z cachedir: .pytest_cache 2025-12-04T15:04:54.7811720Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7811769Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7811808Z configfile: pytest.ini 2025-12-04T15:04:54.7811969Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7812044Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.7812308Z stepcurrent: skipping 10 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7812354Z Running 1 items in this shard 2025-12-04T15:04:54.7812357Z 2025-12-04T15:04:54.7812695Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda I1204 14:54:35.341000 439752 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 439821 2025-12-04T15:04:54.7812852Z I1204 14:54:35.342000 439752 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 439822 2025-12-04T15:04:54.7813000Z I1204 14:54:35.342000 439752 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 439823 2025-12-04T15:04:54.7813151Z I1204 14:54:35.343000 439752 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 439824 2025-12-04T15:04:54.7813726Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7813767Z _warn_cpu_init() 2025-12-04T15:04:54.7814068Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7814106Z _init_core_state( 2025-12-04T15:04:54.7814596Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.7814682Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7815249Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7815287Z _warn_cpu_init() 2025-12-04T15:04:54.7815597Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7815636Z _init_core_state( 2025-12-04T15:04:54.7816132Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7816195Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7816761Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7816801Z _warn_cpu_init() 2025-12-04T15:04:54.7817099Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7817138Z _init_core_state( 2025-12-04T15:04:54.7817632Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7817691Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7818259Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7818298Z _warn_cpu_init() 2025-12-04T15:04:54.7818790Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7818849Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7819336Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7819428Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7819907Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7819976Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7820312Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T15:04:54.7820354Z _init_core_state( 2025-12-04T15:04:54.7820853Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.7820912Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.7821202Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7821244Z return func(*args, **kwargs) 2025-12-04T15:04:54.7821471Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7821515Z return func(*args, **kwargs) 2025-12-04T15:04:54.7821739Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7821781Z return func(*args, **kwargs) 2025-12-04T15:04:54.7822000Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T15:04:54.7822040Z return func(*args, **kwargs) 2025-12-04T15:04:54.7822259Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7822300Z return func(*args, **kwargs) 2025-12-04T15:04:54.7822523Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7822564Z return func(*args, **kwargs) 2025-12-04T15:04:54.7822786Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7822826Z return func(*args, **kwargs) 2025-12-04T15:04:54.7823045Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7823087Z return func(*args, **kwargs) 2025-12-04T15:04:54.7823305Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7823360Z return func(*args, **kwargs) 2025-12-04T15:04:54.7823520Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7823685Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7823969Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7824124Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7824418Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7824543Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7824844Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7824994Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7825274Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7825419Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7825695Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7825833Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7826110Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7826258Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7826769Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T15:04:54.7826887Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7827082Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7827480Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7827595Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7827818Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7827992Z [rank2]:E1204 14:55:08.189000 439823 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7828035Z dist init r=2, world=4 2025-12-04T15:04:54.7828172Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7828331Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7828626Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7828781Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7829075Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7829197Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7829473Z [rank0]:E1204 14:55:08.199000 439821 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7829621Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7829898Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7830044Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7830354Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7830489Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7830766Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7830915Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7831424Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T15:04:54.7831539Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7831733Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7832128Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7832274Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7832484Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7832646Z [rank0]:E1204 14:55:08.199000 439821 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7832697Z dist init r=0, world=4 2025-12-04T15:04:54.7832835Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7832993Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7833292Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7833447Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7833730Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7833851Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7834126Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7834277Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7834552Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7834698Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7834972Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7835108Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T15:04:54.7835386Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7835533Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7836040Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T15:04:54.7838852Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7839074Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7839471Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7839586Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7839809Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7839975Z [rank1]:E1204 14:55:08.200000 439822 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7840019Z dist init r=1, world=4 2025-12-04T15:04:54.7840208Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7840367Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7840654Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7840812Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7841095Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7841221Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7841497Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7841644Z [rank3]:E1204 14:55:08.246000 439824 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7841922Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7842069Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7842347Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7842481Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7842758Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7842907Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7843416Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T15:04:54.7843562Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7843757Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7844175Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7844289Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7844512Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7844674Z [rank3]:E1204 14:55:08.246000 439824 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7844714Z dist init r=3, world=4 2025-12-04T15:04:54.7845050Z [rank2]:[W1204 14:55:08.865195603 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7845376Z [rank0]:[W1204 14:55:08.892739520 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7845701Z [rank1]:[W1204 14:55:08.915921662 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7846026Z [rank3]:[W1204 14:55:08.052554791 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7846068Z FAILED [47.0719s] [100%] 2025-12-04T15:04:54.7846070Z 2025-12-04T15:04:54.7846129Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7846261Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda _ 2025-12-04T15:04:54.7846310Z Traceback (most recent call last): 2025-12-04T15:04:54.7846475Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7846520Z self._join_processes(fn) 2025-12-04T15:04:54.7846692Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7846745Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7846922Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7846966Z raise RuntimeError(error) 2025-12-04T15:04:54.7847047Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7847092Z Traceback (most recent call last): 2025-12-04T15:04:54.7847252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7847314Z getattr(self, test_name)() 2025-12-04T15:04:54.7847471Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7847507Z fn() 2025-12-04T15:04:54.7847657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7847698Z method(*args, **kwargs) 2025-12-04T15:04:54.7847847Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7847886Z method(*args, **kwargs) 2025-12-04T15:04:54.7848045Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7848082Z with policy(): 2025-12-04T15:04:54.7848233Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7848276Z raise RuntimeError(msg) 2025-12-04T15:04:54.7848676Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
2025-12-04T15:04:54.7848679Z 2025-12-04T15:04:54.7848755Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7849023Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7849026Z 2025-12-04T15:04:54.7849115Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7849118Z 2025-12-04T15:04:54.7849120Z 2025-12-04T15:04:54.7849198Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7849287Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7849521Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4261ae837511da77.xml - 2025-12-04T15:04:54.7849582Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7849862Z FAILED [47.0719s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.7849909Z Traceback (most recent call last): 2025-12-04T15:04:54.7850073Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7850116Z getattr(self, test_name)() 2025-12-04T15:04:54.7850313Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7850349Z fn() 2025-12-04T15:04:54.7850503Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7850544Z method(*args, **kwargs) 2025-12-04T15:04:54.7850693Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7850732Z method(*args, **kwargs) 2025-12-04T15:04:54.7850881Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7850917Z with policy(): 2025-12-04T15:04:54.7851066Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7851123Z raise RuntimeError(msg) 2025-12-04T15:04:54.7851524Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T15:04:54.7851526Z 2025-12-04T15:04:54.7851599Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7851865Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7851867Z 2025-12-04T15:04:54.7851967Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7852030Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
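The FSDP UserWarnings repeated through this run (`module` on CPU at wrap time, `device_id` given as a bare `cuda` with no index) have the remedy the warnings themselves suggest: pin each rank to an explicit device before wrapping. A minimal sketch, assuming one GPU per rank and that `rank` matches the local device index (illustrative, not the test suite's actual setup code):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_for_rank(module: torch.nn.Module, rank: int) -> FSDP:
        # Bind this process to its GPU first, so FSDP never has to infer a
        # device from a bare `cuda` without an index.
        torch.cuda.set_device(rank)
        # An explicit device_id moves the CPU-resident module to the GPU for
        # sharding initialization, which also silences the CPU-init warning.
        return FSDP(module, device_id=torch.device("cuda", rank))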
2025-12-04T15:04:54.7852094Z ====================== 1 failed, 26 deselected in 47.23s ======================= 2025-12-04T15:04:54.7852133Z Got exit code 1 2025-12-04T15:04:54.7852361Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2025-12-04T15:04:54.7852490Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.7852679Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7a80ab2bccd929ba.xml 2025-12-04T15:04:54.7852737Z ============================= test session starts ============================== 2025-12-04T15:04:54.7852856Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7852897Z cachedir: .pytest_cache 2025-12-04T15:04:54.7853056Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7853105Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7853146Z configfile: pytest.ini 2025-12-04T15:04:54.7853310Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7853384Z collecting ... collected 60 items / 11 deselected / 49 selected 2025-12-04T15:04:54.7853436Z stepcurrent: skipping 11 already run items. 2025-12-04T15:04:54.7853479Z Running 16 items in this shard 2025-12-04T15:04:54.7853481Z 2025-12-04T15:04:54.7853791Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_none_cuda I1204 14:55:25.168000 441162 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 441231 2025-12-04T15:04:54.7853944Z I1204 14:55:25.169000 441162 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 441232 2025-12-04T15:04:54.7854096Z I1204 14:55:25.170000 441162 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 441233 2025-12-04T15:04:54.7854247Z I1204 14:55:25.170000 441162 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 441234 2025-12-04T15:04:54.7854822Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7854861Z _warn_cpu_init() 2025-12-04T15:04:54.7855429Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7855487Z _warn_cpu_init() 2025-12-04T15:04:54.7856054Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7856090Z _warn_cpu_init() 2025-12-04T15:04:54.7856656Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7856696Z _warn_cpu_init() 2025-12-04T15:04:54.7856985Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7857027Z return func(*args, **kwargs) 2025-12-04T15:04:54.7857169Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7857331Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7857625Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7857781Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7858065Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7858190Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7858468Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7858617Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7858892Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7859037Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7859311Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7859458Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7859755Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7859904Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7860432Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T15:04:54.7860548Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7860743Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7861118Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7861231Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7861441Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7861604Z [rank0]:E1204 14:55:32.994000 441231 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7861645Z dist init r=0, world=4 2025-12-04T15:04:54.7861782Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7861941Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7862227Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7862380Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7862663Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7862786Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7863064Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] 
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7863209Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7863485Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7863631Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7863917Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7864067Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7864344Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7864501Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7864979Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
2025-12-04T15:04:54.7865105Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7865299Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7865659Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7865772Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7865982Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7866147Z [rank2]:E1204 14:55:32.997000 441233 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7866185Z dist init r=2, world=4 2025-12-04T15:04:54.7866322Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7866480Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7866768Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7866919Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7867204Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7867327Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7867599Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7867746Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7868023Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7868190Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7868464Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7868598Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7868882Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7869030Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7869518Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 74240 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T15:04:54.7869631Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7869827Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7870224Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7870339Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7870549Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7870711Z [rank1]:E1204 14:55:33.001000 441232 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7870750Z dist init r=1, world=4 2025-12-04T15:04:54.7870886Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7871043Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7871330Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7871483Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7871764Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7871888Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7872161Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7872335Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T15:04:54.7872613Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7872757Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7873044Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7873178Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7873468Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7873614Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7874090Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T15:04:54.7874204Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7874398Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7874761Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7874874Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7875085Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7875246Z [rank3]:E1204 14:55:33.042000 441234 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7875286Z dist init r=3, world=4 2025-12-04T15:04:54.7875619Z [rank0]:[W1204 14:55:33.674139512 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7875658Z FAILED [9.7210s] [ 6%] 2025-12-04T15:04:54.7875660Z 2025-12-04T15:04:54.7875715Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7875816Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda __ 2025-12-04T15:04:54.7875861Z Traceback (most recent call last): 2025-12-04T15:04:54.7876022Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7876067Z self._join_processes(fn) 2025-12-04T15:04:54.7876237Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7876318Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7876494Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7876538Z raise RuntimeError(error) 2025-12-04T15:04:54.7876617Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7876662Z Traceback (most recent call last): 2025-12-04T15:04:54.7876822Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7876864Z getattr(self, test_name)() 2025-12-04T15:04:54.7877030Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7877066Z fn() 2025-12-04T15:04:54.7877216Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7877258Z method(*args, **kwargs) 2025-12-04T15:04:54.7877416Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7877459Z method(*args, **kwargs) 2025-12-04T15:04:54.7877607Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7877644Z with policy(): 2025-12-04T15:04:54.7877792Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7877833Z raise RuntimeError(msg) 2025-12-04T15:04:54.7878187Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 
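The ProcessGroupNCCL warnings interleaved above state the fix directly: tear the process group down before each worker exits. A minimal sketch of that teardown, assuming the usual init_process_group-based worker (illustrative only):

    import torch.distributed as dist

    def worker(rank: int, world_size: int) -> None:
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            ...  # test body runs here
        finally:
            # Explicit teardown releases communicator resources and avoids
            # the destroy_process_group() warning seen in this log.
            dist.destroy_process_group()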
2025-12-04T15:04:54.7878191Z 2025-12-04T15:04:54.7878265Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7878499Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7878503Z 2025-12-04T15:04:54.7878590Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7878592Z 2025-12-04T15:04:54.7878594Z 2025-12-04T15:04:54.7878668Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7878755Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7878986Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7a80ab2bccd929ba.xml - 2025-12-04T15:04:54.7879046Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7879305Z FAILED [9.7210s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7879350Z Traceback (most recent call last): 2025-12-04T15:04:54.7879515Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7879557Z getattr(self, test_name)() 2025-12-04T15:04:54.7879714Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7879749Z fn() 2025-12-04T15:04:54.7879903Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7879943Z method(*args, **kwargs) 2025-12-04T15:04:54.7880105Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7880156Z method(*args, **kwargs) 2025-12-04T15:04:54.7880346Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7880382Z with policy(): 2025-12-04T15:04:54.7880534Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7880575Z raise RuntimeError(msg) 2025-12-04T15:04:54.7880945Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T15:04:54.7880947Z 2025-12-04T15:04:54.7881023Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7881258Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7881261Z 2025-12-04T15:04:54.7881361Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7881423Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
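[editor's note] The failure above is raised by the CUDA memory-leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: it snapshots per-device memory before the test body and fails the test if the numbers have grown afterwards (here the caching allocator went from 512 bytes to ~68 KB and driver-allocated memory grew by ~1.5 GB). A minimal sketch of that before/after pattern, using a hypothetical `assert_no_cuda_leak` helper rather than the real checker in torch/testing/_internal/common_utils.py:

    import torch

    # Hypothetical sketch of the before/after comparison a CUDA mem-leak
    # check performs; NOT the actual checker from
    # torch/testing/_internal/common_utils.py.
    class assert_no_cuda_leak:
        def __enter__(self):
            torch.cuda.synchronize()
            torch.cuda.empty_cache()
            # Caching-allocator bytes per device before the test body runs.
            self.before = [torch.cuda.memory_allocated(i)
                           for i in range(torch.cuda.device_count())]
            return self

        def __exit__(self, exc_type, exc, tb):
            if exc_type is not None:
                return False  # don't mask the test's own exception
            torch.cuda.synchronize()
            torch.cuda.empty_cache()
            for i, before in enumerate(self.before):
                after = torch.cuda.memory_allocated(i)
                if after > before:
                    raise RuntimeError(
                        f"possible leak on device {i}: {before} -> {after} bytes")
            return False

Used as `with assert_no_cuda_leak(): ...` around the test body, which mirrors the `with policy():` frame visible in the tracebacks above.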
2025-12-04T15:04:54.7881488Z ======================= 1 failed, 11 deselected in 9.89s ======================= 2025-12-04T15:04:54.7881526Z Got exit code 1 2025-12-04T15:04:54.7881568Z Retrying single test... 2025-12-04T15:04:54.7881756Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-21cba412d5b4143e.xml 2025-12-04T15:04:54.7881814Z ============================= test session starts ============================== 2025-12-04T15:04:54.7881927Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7881969Z cachedir: .pytest_cache 2025-12-04T15:04:54.7882125Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7882171Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7882211Z configfile: pytest.ini 2025-12-04T15:04:54.7882373Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7882448Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.7882677Z stepcurrent: skipping 11 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7882720Z Running 1 items in this shard 2025-12-04T15:04:54.7882722Z 2025-12-04T15:04:54.7883036Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_none_cuda I1204 14:55:37.417000 441564 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 441633 2025-12-04T15:04:54.7883192Z I1204 14:55:37.418000 441564 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 441634 2025-12-04T15:04:54.7883347Z I1204 14:55:37.419000 441564 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 441635 2025-12-04T15:04:54.7883500Z I1204 14:55:37.419000 441564 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 441636 2025-12-04T15:04:54.7884090Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7884155Z _warn_cpu_init() 2025-12-04T15:04:54.7884719Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7884757Z _warn_cpu_init() 2025-12-04T15:04:54.7885330Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7885381Z _warn_cpu_init() 2025-12-04T15:04:54.7885943Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7885980Z _warn_cpu_init() 2025-12-04T15:04:54.7886272Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7886316Z return func(*args, **kwargs) 2025-12-04T15:04:54.7886463Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7886626Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7886916Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7887073Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7887357Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7887482Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7887757Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7887906Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7888182Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7888331Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7888628Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7888763Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7889042Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7889199Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7889679Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T15:04:54.7889812Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7890009Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7890408Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7890520Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7890732Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7890898Z [rank1]:E1204 14:55:45.185000 441634 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7890937Z dist init r=1, world=4 2025-12-04T15:04:54.7891075Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7891232Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7891517Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7891671Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7891956Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7892078Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7892355Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] 
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7892500Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7892788Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7892948Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7893223Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7893357Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7893645Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7893793Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7894287Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
2025-12-04T15:04:54.7894402Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7894595Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7894956Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7895069Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7895280Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7895442Z [rank2]:E1204 14:55:45.190000 441635 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7895482Z dist init r=2, world=4 2025-12-04T15:04:54.7895619Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7895779Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7896065Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7896216Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7896497Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7896621Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7896897Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7897064Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7897340Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7897487Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7897771Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7897909Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7898196Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7898345Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7898821Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T15:04:54.7898937Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7899131Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7899492Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7899605Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7899814Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7899977Z [rank0]:E1204 14:55:45.239000 441633 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7900016Z dist init r=0, world=4 2025-12-04T15:04:54.7900155Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7900352Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7900636Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7900788Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7901074Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7901223Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7901497Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7901642Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T15:04:54.7901932Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7902078Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7902353Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7902503Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7902778Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7902928Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7903407Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T15:04:54.7903520Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7903715Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7904075Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7904186Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7904395Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7904559Z [rank3]:E1204 14:55:45.246000 441636 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7904598Z dist init r=3, world=4 2025-12-04T15:04:54.7904933Z [rank0]:[W1204 14:55:45.014378468 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7904972Z FAILED [9.7197s] [100%] 2025-12-04T15:04:54.7904974Z 2025-12-04T15:04:54.7905030Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7905131Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda __ 2025-12-04T15:04:54.7905190Z Traceback (most recent call last): 2025-12-04T15:04:54.7905362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7905405Z self._join_processes(fn) 2025-12-04T15:04:54.7905578Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7905630Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7905808Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7905850Z raise RuntimeError(error) 2025-12-04T15:04:54.7905944Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.7905989Z Traceback (most recent call last): 2025-12-04T15:04:54.7906148Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7906190Z getattr(self, test_name)() 2025-12-04T15:04:54.7906348Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7906391Z fn() 2025-12-04T15:04:54.7906540Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7906582Z method(*args, **kwargs) 2025-12-04T15:04:54.7906731Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7906771Z method(*args, **kwargs) 2025-12-04T15:04:54.7906921Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7906959Z with policy(): 2025-12-04T15:04:54.7907110Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7907152Z raise RuntimeError(msg) 2025-12-04T15:04:54.7907509Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 
2025-12-04T15:04:54.7907512Z 2025-12-04T15:04:54.7907587Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7907820Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7907822Z 2025-12-04T15:04:54.7907912Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7907914Z 2025-12-04T15:04:54.7907915Z 2025-12-04T15:04:54.7907989Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7908077Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7908309Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-21cba412d5b4143e.xml - 2025-12-04T15:04:54.7908369Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7908618Z FAILED [9.7197s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.7908663Z Traceback (most recent call last): 2025-12-04T15:04:54.7908826Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7908868Z getattr(self, test_name)() 2025-12-04T15:04:54.7909025Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7909080Z fn() 2025-12-04T15:04:54.7909231Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7909271Z method(*args, **kwargs) 2025-12-04T15:04:54.7909423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7909463Z method(*args, **kwargs) 2025-12-04T15:04:54.7909612Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7909649Z with policy(): 2025-12-04T15:04:54.7909809Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7909850Z raise RuntimeError(msg) 2025-12-04T15:04:54.7910249Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T15:04:54.7910268Z 2025-12-04T15:04:54.7910341Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7910575Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7910577Z 2025-12-04T15:04:54.7910662Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7910727Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
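[editor's note] Each retry also emits the `_warn_cpu_init()` UserWarning four times (once per rank) because the module handed to FSDP is still on CPU. A minimal sketch of the fix the warning itself recommends, passing `device_id` so FSDP moves the module to the local GPU before sharding; the model and function here are placeholders, not code from test_fsdp_core.py, and an initialized process group is assumed:

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(rank: int) -> FSDP:
        # Constructed on CPU, exactly the situation the warning describes.
        model = nn.Linear(1024, 1024)
        # device_id tells FSDP to move the module to this GPU for the
        # sharding init; that also satisfies the GPU-communication
        # requirement of sync_module_states=True.
        return FSDP(model,
                    device_id=torch.device("cuda", rank),
                    sync_module_states=True)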
2025-12-04T15:04:54.7910787Z ======================= 1 failed, 26 deselected in 9.88s ======================= 2025-12-04T15:04:54.7910826Z Got exit code 1 2025-12-04T15:04:54.7910867Z Retrying single test... 2025-12-04T15:04:54.7911054Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ab9eb51514ba22c4.xml 2025-12-04T15:04:54.7911112Z ============================= test session starts ============================== 2025-12-04T15:04:54.7911224Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7911264Z cachedir: .pytest_cache 2025-12-04T15:04:54.7911420Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7911465Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7911506Z configfile: pytest.ini 2025-12-04T15:04:54.7911670Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7911744Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.7911974Z stepcurrent: skipping 11 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7912017Z Running 1 items in this shard 2025-12-04T15:04:54.7912020Z 2025-12-04T15:04:54.7912327Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_none_cuda I1204 14:55:49.731000 441966 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 442035 2025-12-04T15:04:54.7912478Z I1204 14:55:49.731000 441966 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 442036 2025-12-04T15:04:54.7912630Z I1204 14:55:49.732000 441966 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 442037 2025-12-04T15:04:54.7912779Z I1204 14:55:49.733000 441966 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 442038 2025-12-04T15:04:54.7913370Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7913422Z _warn_cpu_init() 2025-12-04T15:04:54.7914001Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.7914041Z _warn_cpu_init() 2025-12-04T15:04:54.7914612Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7914651Z _warn_cpu_init() 2025-12-04T15:04:54.7915215Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7915253Z _warn_cpu_init() 2025-12-04T15:04:54.7915544Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7915587Z return func(*args, **kwargs) 2025-12-04T15:04:54.7915729Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7915891Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7916183Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7916337Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7916620Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7916743Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7917020Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7917168Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7917453Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7917610Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7917884Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7918028Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7918310Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7918459Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7918947Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T15:04:54.7919063Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7919258Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7919617Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7919732Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7919940Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7920103Z [rank0]:E1204 14:55:57.564000 442035 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7920144Z dist init r=0, world=4 2025-12-04T15:04:54.7920318Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7920475Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7920765Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7920917Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7921198Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7921323Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7921597Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] 
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7921776Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7922050Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7922196Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7922480Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7922618Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7922910Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7923056Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7923534Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 
2025-12-04T15:04:54.7923648Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7923842Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7924201Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7924313Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7924524Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7924685Z [rank3]:E1204 14:55:57.572000 442038 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7924727Z dist init r=3, world=4 2025-12-04T15:04:54.7924862Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7925022Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7925310Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7925463Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7925744Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7925888Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7926164Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7926309Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7926593Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7926738Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7927021Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7927157Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7927434Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7927581Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7928062Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T15:04:54.7928177Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7928369Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7928732Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7928843Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7929054Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7929218Z [rank1]:E1204 14:55:57.595000 442036 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7929257Z dist init r=1, world=4 2025-12-04T15:04:54.7929393Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7929551Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7929840Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7930001Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7930328Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7930450Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7930737Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7930883Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T15:04:54.7931156Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7931315Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7931588Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7931723Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7932000Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7932149Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7932626Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 2025-12-04T15:04:54.7932740Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7932933Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7933292Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7933406Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7933613Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7933774Z [rank2]:E1204 14:55:57.611000 442037 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7933812Z dist init r=2, world=4 2025-12-04T15:04:54.7934145Z [rank0]:[W1204 14:55:57.248104406 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7934196Z FAILED [9.8204s] [100%] 2025-12-04T15:04:54.7934211Z 2025-12-04T15:04:54.7934266Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7934368Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda __ 2025-12-04T15:04:54.7934414Z Traceback (most recent call last): 2025-12-04T15:04:54.7934575Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7934618Z self._join_processes(fn) 2025-12-04T15:04:54.7934788Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7934851Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7935027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7935071Z raise RuntimeError(error) 2025-12-04T15:04:54.7935149Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7935194Z Traceback (most recent call last): 2025-12-04T15:04:54.7935361Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7935404Z getattr(self, test_name)() 2025-12-04T15:04:54.7935559Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7935594Z fn() 2025-12-04T15:04:54.7935742Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7935783Z method(*args, **kwargs) 2025-12-04T15:04:54.7935936Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7935977Z method(*args, **kwargs) 2025-12-04T15:04:54.7936126Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7936162Z with policy(): 2025-12-04T15:04:54.7936315Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7936355Z raise RuntimeError(msg) 2025-12-04T15:04:54.7936710Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 
2025-12-04T15:04:54.7936713Z 2025-12-04T15:04:54.7936786Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7937019Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7937023Z 2025-12-04T15:04:54.7937110Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7937112Z 2025-12-04T15:04:54.7937172Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7937215Z Traceback (most recent call last): 2025-12-04T15:04:54.7937376Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7937420Z getattr(self, test_name)() 2025-12-04T15:04:54.7937579Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7937613Z fn() 2025-12-04T15:04:54.7937764Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7937819Z method(*args, **kwargs) 2025-12-04T15:04:54.7937969Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7938019Z method(*args, **kwargs) 2025-12-04T15:04:54.7938170Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7938207Z with policy(): 2025-12-04T15:04:54.7938357Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7938398Z raise RuntimeError(msg) 2025-12-04T15:04:54.7938763Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T15:04:54.7938765Z 2025-12-04T15:04:54.7938840Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7939073Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7939084Z 2025-12-04T15:04:54.7939171Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7939173Z 2025-12-04T15:04:54.7939175Z 2025-12-04T15:04:54.7939249Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7939336Z Process 0 terminated with exit code 10, terminating remaining processes. 
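The failures above all come from the CUDA memory-leak checker, which snapshots allocator and driver statistics around the test body and raises when both grow. A minimal sketch of that kind of before/after comparison, assuming a placeholder fn for the test body (this is not PyTorch's actual leak-check implementation):

    import torch

    def check_for_leak(fn, device: int = 0) -> None:
        # Snapshot caching-allocator and driver-level memory before the body.
        torch.cuda.synchronize(device)
        allocated_before = torch.cuda.memory_allocated(device)  # caching-allocator bytes
        free_before, _total = torch.cuda.mem_get_info(device)   # driver-level free bytes

        fn()  # placeholder for the test body

        # Re-snapshot after the body; a real leak shows up in both views at once.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        allocated_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if allocated_after > allocated_before and free_after < free_before:
            raise RuntimeError(
                f"possible CUDA leak on device {device}: caching allocator "
                f"{allocated_before} -> {allocated_after} bytes, driver free "
                f"{free_before} -> {free_after} bytes"
            )

In the errors above, both views grew together (caching allocator 512 -> ~66 KB, driver ~2.3 GB -> ~3.8 GB), which is exactly the condition such a checker flags.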
2025-12-04T15:04:54.7939568Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ab9eb51514ba22c4.xml - 2025-12-04T15:04:54.7939628Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7939877Z FAILED [9.8204s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7939924Z Traceback (most recent call last): 2025-12-04T15:04:54.7940086Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7940128Z getattr(self, test_name)() 2025-12-04T15:04:54.7940325Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7940359Z fn() 2025-12-04T15:04:54.7940511Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7940550Z method(*args, **kwargs) 2025-12-04T15:04:54.7940699Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7940738Z method(*args, **kwargs) 2025-12-04T15:04:54.7940887Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7940923Z with policy(): 2025-12-04T15:04:54.7941074Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7941114Z raise RuntimeError(msg) 2025-12-04T15:04:54.7941467Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 
2025-12-04T15:04:54.7941470Z 2025-12-04T15:04:54.7941541Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7941773Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7941804Z 2025-12-04T15:04:54.7941890Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7941893Z 2025-12-04T15:04:54.7941950Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.7941995Z Traceback (most recent call last): 2025-12-04T15:04:54.7942155Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7942197Z getattr(self, test_name)() 2025-12-04T15:04:54.7942353Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7942399Z fn() 2025-12-04T15:04:54.7942548Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7942589Z method(*args, **kwargs) 2025-12-04T15:04:54.7942737Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7942777Z method(*args, **kwargs) 2025-12-04T15:04:54.7942938Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7942976Z with policy(): 2025-12-04T15:04:54.7943125Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7943166Z raise RuntimeError(msg) 2025-12-04T15:04:54.7943519Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T15:04:54.7943520Z 2025-12-04T15:04:54.7943595Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7943828Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7943831Z 2025-12-04T15:04:54.7943918Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7943980Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
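The repro line printed above can also be driven from Python; a small sketch using subprocess, with the command and environment flags copied verbatim from the log and run from the base repo dir:

    import os
    import subprocess

    # Environment flags taken from the repro instructions printed above.
    env = dict(
        os.environ,
        PYTORCH_TEST_WITH_ROCM="1",
        PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
    )
    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_core.py",
            "TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_none_cuda",
        ],
        env=env,
        check=True,  # raise if the test process exits non-zero
    )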
2025-12-04T15:04:54.7944042Z ======================= 1 failed, 26 deselected in 9.98s ======================= 2025-12-04T15:04:54.7944078Z Got exit code 1 2025-12-04T15:04:54.7944264Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_none_cuda 2025-12-04T15:04:54.7944394Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.7944583Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8b1151a930ebfa73.xml 2025-12-04T15:04:54.7944643Z ============================= test session starts ============================== 2025-12-04T15:04:54.7944756Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7944798Z cachedir: .pytest_cache 2025-12-04T15:04:54.7944954Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7945000Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7945039Z configfile: pytest.ini 2025-12-04T15:04:54.7945201Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7945275Z collecting ... collected 60 items / 12 deselected / 48 selected 2025-12-04T15:04:54.7945328Z stepcurrent: skipping 12 already run items. 2025-12-04T15:04:54.7945380Z Running 15 items in this shard 2025-12-04T15:04:54.7945392Z 2025-12-04T15:04:54.7945707Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_no_shard_cuda I1204 14:56:02.175000 442368 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 442437 2025-12-04T15:04:54.7945859Z I1204 14:56:02.176000 442368 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 442438 2025-12-04T15:04:54.7946010Z I1204 14:56:02.177000 442368 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 442439 2025-12-04T15:04:54.7946169Z I1204 14:56:02.177000 442368 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 442440 2025-12-04T15:04:54.7946454Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7946504Z return wrapper_cls(module, **kwargs) 2025-12-04T15:04:54.7947083Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7947122Z _warn_cpu_init() 2025-12-04T15:04:54.7947409Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T15:04:54.7947498Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T15:04:54.7947790Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.7947834Z return func(*args, **kwargs) 2025-12-04T15:04:54.7948109Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7948156Z return wrapper_cls(module, **kwargs) 2025-12-04T15:04:54.7948431Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7948475Z return wrapper_cls(module, **kwargs) 2025-12-04T15:04:54.7948747Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7948793Z return wrapper_cls(module, **kwargs) 2025-12-04T15:04:54.7949360Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7949396Z _warn_cpu_init() 2025-12-04T15:04:54.7949962Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7950019Z _warn_cpu_init() 2025-12-04T15:04:54.7950631Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7950670Z _warn_cpu_init() 2025-12-04T15:04:54.7950954Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7951045Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T15:04:54.7951351Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7951438Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T15:04:54.7951722Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7951807Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T15:04:54.7952035Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7952079Z return func(*args, **kwargs) 2025-12-04T15:04:54.7952306Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7952348Z return func(*args, **kwargs) 2025-12-04T15:04:54.7952574Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7952615Z return func(*args, **kwargs) 2025-12-04T15:04:54.7952838Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7952877Z return func(*args, **kwargs) 2025-12-04T15:04:54.7953097Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7953137Z return func(*args, **kwargs) 2025-12-04T15:04:54.7953357Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7953396Z return func(*args, **kwargs) 2025-12-04T15:04:54.7953617Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7953657Z return func(*args, **kwargs) 2025-12-04T15:04:54.7953875Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T15:04:54.7953929Z return func(*args, **kwargs) 2025-12-04T15:04:54.7954076Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7954253Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7954546Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7954701Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7954998Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7955124Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7955410Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7955559Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7955834Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7955983Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7956257Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7956397Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7956675Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7956823Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7957320Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T15:04:54.7957438Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7957632Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7957997Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.7958112Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7958325Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7958507Z [rank2]:E1204 14:56:09.914000 442439 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7958547Z dist init r=2, world=4 2025-12-04T15:04:54.7958683Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7958840Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7959133Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7959287Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7959583Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7959709Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7959983Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7960131Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7960441Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7960588Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7960866Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7961002Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7961280Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7961428Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7961917Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 116224 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T15:04:54.7962031Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7962224Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7962589Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.7962725Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7962935Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7963098Z [rank1]:E1204 14:56:09.915000 442438 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7963137Z dist init r=1, world=4 2025-12-04T15:04:54.7963288Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7963445Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7963729Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7963895Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7964182Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7964305Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7964579Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7964726Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T15:04:54.7965004Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7965149Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7965424Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7965561Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7965839Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7965990Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7966476Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 116224 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T15:04:54.7966590Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7966792Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7967169Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.7967281Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7967491Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7967670Z [rank0]:E1204 14:56:09.919000 442437 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.7967710Z dist init r=0, world=4 2025-12-04T15:04:54.7967846Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7968014Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7968300Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7968452Z [rank3]:E1204 14:56:09.967000 442440 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7968740Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7968865Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7969144Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7969291Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7969566Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7969712Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7969986Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7970125Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7970428Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7970577Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7971064Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 120320 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T15:04:54.7971202Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7971395Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7971757Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.7971882Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7972089Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7972254Z [rank3]:E1204 14:56:09.967000 442440 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.7972304Z dist init r=3, world=4 2025-12-04T15:04:54.7972638Z [rank0]:[W1204 14:56:10.610204313 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.7972679Z FAILED [9.6202s] [ 6%] 2025-12-04T15:04:54.7972681Z 2025-12-04T15:04:54.7972737Z =================================== FAILURES =================================== 2025-12-04T15:04:54.7972843Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda _ 2025-12-04T15:04:54.7972888Z Traceback (most recent call last): 2025-12-04T15:04:54.7973052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.7973096Z self._join_processes(fn) 2025-12-04T15:04:54.7973269Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.7973322Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.7973498Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.7973541Z raise RuntimeError(error) 2025-12-04T15:04:54.7973622Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7973666Z Traceback (most recent call last): 2025-12-04T15:04:54.7973827Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7973871Z getattr(self, test_name)() 2025-12-04T15:04:54.7974028Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7974063Z fn() 2025-12-04T15:04:54.7974214Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7974254Z method(*args, **kwargs) 2025-12-04T15:04:54.7974403Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7974442Z method(*args, **kwargs) 2025-12-04T15:04:54.7974591Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7974629Z with policy(): 2025-12-04T15:04:54.7974778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7974829Z raise RuntimeError(msg) 2025-12-04T15:04:54.7975188Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 116224 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T15:04:54.7975200Z 2025-12-04T15:04:54.7975274Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7975513Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.7975515Z 2025-12-04T15:04:54.7975615Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7975617Z 2025-12-04T15:04:54.7975619Z 2025-12-04T15:04:54.7975693Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.7975781Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.7976012Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8b1151a930ebfa73.xml - 2025-12-04T15:04:54.7976084Z =========================== short test summary info ============================ 2025-12-04T15:04:54.7976336Z FAILED [9.6202s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.7976384Z Traceback (most recent call last): 2025-12-04T15:04:54.7976546Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7976589Z getattr(self, test_name)() 2025-12-04T15:04:54.7976745Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7976781Z fn() 2025-12-04T15:04:54.7976932Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7976972Z method(*args, **kwargs) 2025-12-04T15:04:54.7977121Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7977161Z method(*args, **kwargs) 2025-12-04T15:04:54.7977309Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7977348Z with policy(): 2025-12-04T15:04:54.7977498Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7977539Z raise RuntimeError(msg) 2025-12-04T15:04:54.7977895Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 116224 on device 0. 
CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T15:04:54.7977899Z 2025-12-04T15:04:54.7977972Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7978208Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.7978210Z 2025-12-04T15:04:54.7978295Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7978358Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.7978420Z ======================= 1 failed, 12 deselected in 9.78s ======================= 2025-12-04T15:04:54.7978458Z Got exit code 1 2025-12-04T15:04:54.7978499Z Retrying single test... 2025-12-04T15:04:54.7978700Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4414813e7fbfb52d.xml 2025-12-04T15:04:54.7978770Z ============================= test session starts ============================== 2025-12-04T15:04:54.7978882Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.7978924Z cachedir: .pytest_cache 2025-12-04T15:04:54.7979080Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.7979125Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.7979166Z configfile: pytest.ini 2025-12-04T15:04:54.7979337Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.7979412Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.7979647Z stepcurrent: skipping 12 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.7979691Z Running 1 items in this shard 2025-12-04T15:04:54.7979693Z 2025-12-04T15:04:54.7980015Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_no_shard_cuda I1204 14:56:14.329000 442770 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 442839 2025-12-04T15:04:54.7980200Z I1204 14:56:14.330000 442770 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 442840 2025-12-04T15:04:54.7980353Z I1204 14:56:14.330000 442770 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 442841 2025-12-04T15:04:54.7980502Z I1204 14:56:14.331000 442770 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 442842 2025-12-04T15:04:54.7980789Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7980836Z return wrapper_cls(module, **kwargs) 2025-12-04T15:04:54.7981118Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
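The FutureWarning above recommends DistributedDataParallel over the deprecated NO_SHARD strategy. A minimal sketch of that swap, assuming a stand-in nn.Linear module and an already-initialized process group (e.g. under torchrun):

    import os
    import torch.nn as nn
    from torch.nn.parallel import DistributedDataParallel as DDP

    # Assumes the process group is already initialized by the launcher.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    model = nn.Linear(8, 8).to(f"cuda:{local_rank}")  # stand-in module

    # NO_SHARD keeps a full parameter replica on every rank, so DDP is the
    # drop-in substitute the FutureWarning suggests.
    ddp_model = DDP(model, device_ids=[local_rank])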
2025-12-04T15:04:54.7981165Z return wrapper_cls(module, **kwargs) 2025-12-04T15:04:54.7981443Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7981486Z return wrapper_cls(module, **kwargs) 2025-12-04T15:04:54.7982067Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7982106Z _warn_cpu_init() 2025-12-04T15:04:54.7982671Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7982708Z _warn_cpu_init() 2025-12-04T15:04:54.7983286Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7983346Z _warn_cpu_init() 2025-12-04T15:04:54.7983631Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7983731Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T15:04:54.7984018Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7984104Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T15:04:54.7984402Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7984487Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T15:04:54.7984776Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
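The two UserWarnings above (the _warn_cpu_init notice and the barrier() notice) both point at the same fix: bind an explicit device at init time. A minimal sketch under a torchrun-style launch; the nn.Linear model is a stand-in for the test's nested module, and passing device_id to init_process_group assumes a recent PyTorch that accepts it, as the warning text itself suggests:

    import os
    import torch
    import torch.distributed as dist
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
    from torch.distributed.fsdp.wrap import always_wrap_policy

    # Assumes torchrun sets RANK/WORLD_SIZE/LOCAL_RANK/MASTER_ADDR/MASTER_PORT.
    local_rank = int(os.environ["LOCAL_RANK"])
    device = torch.device("cuda", local_rank)

    # An explicit device here addresses the barrier() warning above.
    dist.init_process_group("nccl", device_id=device)

    model = nn.Linear(8, 8)  # stand-in for the nested always-wrap model

    # device_id moves FSDP's sharding initialization onto the GPU, which is
    # what the _warn_cpu_init() UserWarning recommends.
    fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, device_id=device)

    dist.destroy_process_group()  # explicit teardown; avoids the NCCL exit warning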
2025-12-04T15:04:54.7984822Z return func(*args, **kwargs) 2025-12-04T15:04:54.7985098Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7985144Z return wrapper_cls(module, **kwargs) 2025-12-04T15:04:54.7985710Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.7985747Z _warn_cpu_init() 2025-12-04T15:04:54.7986030Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.7986115Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T15:04:54.7986343Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7986385Z return func(*args, **kwargs) 2025-12-04T15:04:54.7986609Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7986650Z return func(*args, **kwargs) 2025-12-04T15:04:54.7986872Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7986913Z return func(*args, **kwargs) 2025-12-04T15:04:54.7987132Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7987194Z return func(*args, **kwargs) 2025-12-04T15:04:54.7987412Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7987454Z return func(*args, **kwargs) 2025-12-04T15:04:54.7987672Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7987711Z return func(*args, **kwargs) 2025-12-04T15:04:54.7987939Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.7987979Z return func(*args, **kwargs) 2025-12-04T15:04:54.7988200Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T15:04:54.7988241Z return func(*args, **kwargs) 2025-12-04T15:04:54.7988396Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7988558Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7988848Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7989006Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7989293Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7989419Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7989695Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7989843Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7990117Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7990295Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7990571Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7990708Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.7990983Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7991133Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7991626Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 112128 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T15:04:54.7991768Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7991965Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7992342Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.7992460Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7992683Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7992848Z [rank2]:E1204 14:56:22.134000 442841 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.7992888Z dist init r=2, world=4 2025-12-04T15:04:54.7993026Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7993184Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7993475Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7993631Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7993917Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7994040Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7994316Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7994463Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7994737Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7994885Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7995161Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7995297Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.7995574Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.7995740Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.7996229Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T15:04:54.7996341Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7996544Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.7996919Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.7997032Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.7997244Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.7997405Z [rank1]:E1204 14:56:22.136000 442840 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.7997445Z dist init r=1, world=4 2025-12-04T15:04:54.7997581Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.7997741Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.7998026Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.7998179Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.7998463Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.7998585Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.7998859Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7999007Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T15:04:54.7999279Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.7999425Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.7999698Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.7999849Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8000137Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8000322Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8000822Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 116224 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T15:04:54.8000938Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8001145Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8001508Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.8001620Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8001830Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8001994Z [rank0]:E1204 14:56:22.152000 442839 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8002035Z dist init r=0, world=4 2025-12-04T15:04:54.8002171Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8002331Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8002617Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8002771Z [rank3]:E1204 14:56:22.177000 442842 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8003056Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8003179Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8003454Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8003599Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8003873Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8004019Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8004317Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8004453Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8004726Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8004884Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8005379Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T15:04:54.8005495Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8005688Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8006050Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.8006162Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8006372Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8006534Z [rank3]:E1204 14:56:22.177000 442842 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8006572Z dist init r=3, world=4 2025-12-04T15:04:54.8006907Z [rank0]:[W1204 14:56:22.904391063 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8006946Z FAILED [9.7208s] [100%] 2025-12-04T15:04:54.8006948Z 2025-12-04T15:04:54.8007004Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8007108Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda _ 2025-12-04T15:04:54.8007155Z Traceback (most recent call last): 2025-12-04T15:04:54.8007316Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8007360Z self._join_processes(fn) 2025-12-04T15:04:54.8007531Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8007585Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8007760Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8007803Z raise RuntimeError(error) 2025-12-04T15:04:54.8007884Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8007928Z Traceback (most recent call last): 2025-12-04T15:04:54.8008096Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8008148Z getattr(self, test_name)() 2025-12-04T15:04:54.8008305Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8008340Z fn() 2025-12-04T15:04:54.8008492Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8008532Z method(*args, **kwargs) 2025-12-04T15:04:54.8008680Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8008719Z method(*args, **kwargs) 2025-12-04T15:04:54.8008879Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8008918Z with policy(): 2025-12-04T15:04:54.8009073Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8009114Z raise RuntimeError(msg) 2025-12-04T15:04:54.8009483Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 116224 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T15:04:54.8009486Z 2025-12-04T15:04:54.8009560Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8009799Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.8009801Z 2025-12-04T15:04:54.8009887Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8009890Z 2025-12-04T15:04:54.8009895Z 2025-12-04T15:04:54.8009969Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8010056Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8010322Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4414813e7fbfb52d.xml - 2025-12-04T15:04:54.8010383Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8010636Z FAILED [9.7208s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8010683Z Traceback (most recent call last): 2025-12-04T15:04:54.8010844Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8010888Z getattr(self, test_name)() 2025-12-04T15:04:54.8011044Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8011080Z fn() 2025-12-04T15:04:54.8011231Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8011271Z method(*args, **kwargs) 2025-12-04T15:04:54.8011420Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8011460Z method(*args, **kwargs) 2025-12-04T15:04:54.8011607Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8011645Z with policy(): 2025-12-04T15:04:54.8011794Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8011850Z raise RuntimeError(msg) 2025-12-04T15:04:54.8012207Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 116224 on device 0. 
CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T15:04:54.8012222Z 2025-12-04T15:04:54.8012297Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8012534Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.8012537Z 2025-12-04T15:04:54.8012635Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8012698Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.8012760Z ======================= 1 failed, 26 deselected in 9.88s ======================= 2025-12-04T15:04:54.8012799Z Got exit code 1 2025-12-04T15:04:54.8012840Z Retrying single test... 2025-12-04T15:04:54.8013046Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7985ca189ce8bda8.xml 2025-12-04T15:04:54.8013103Z ============================= test session starts ============================== 2025-12-04T15:04:54.8013214Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8013255Z cachedir: .pytest_cache 2025-12-04T15:04:54.8013412Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8013460Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8013502Z configfile: pytest.ini 2025-12-04T15:04:54.8013665Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8013741Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8013973Z stepcurrent: skipping 12 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.8014017Z Running 1 items in this shard 2025-12-04T15:04:54.8014019Z 2025-12-04T15:04:54.8014332Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_no_shard_cuda I1204 14:56:26.664000 443172 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 443241 2025-12-04T15:04:54.8014489Z I1204 14:56:26.665000 443172 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 443242 2025-12-04T15:04:54.8014638Z I1204 14:56:26.665000 443172 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 443243 2025-12-04T15:04:54.8014790Z I1204 14:56:26.666000 443172 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 443244 2025-12-04T15:04:54.8015077Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8015124Z return wrapper_cls(module, **kwargs) 2025-12-04T15:04:54.8015698Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8015736Z _warn_cpu_init() 2025-12-04T15:04:54.8016025Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8016080Z return wrapper_cls(module, **kwargs) 2025-12-04T15:04:54.8016357Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8016400Z return wrapper_cls(module, **kwargs) 2025-12-04T15:04:54.8016974Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8017015Z _warn_cpu_init() 2025-12-04T15:04:54.8017590Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8017628Z _warn_cpu_init() 2025-12-04T15:04:54.8017915Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8018003Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T15:04:54.8018280Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8018327Z return wrapper_cls(module, **kwargs) 2025-12-04T15:04:54.8018898Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8018934Z _warn_cpu_init() 2025-12-04T15:04:54.8019227Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
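The barrier() warning just above, together with the earlier ProcessGroupNCCL warning that destroy_process_group() was never called before exit, both point at process-group lifecycle hygiene. A minimal, hedged sketch of the pattern the two warnings suggest (assumes launch via torchrun, which sets LOCAL_RANK and the rendezvous variables; not the test suite's own setup code):

    import os
    import torch
    import torch.distributed as dist

    def main():
        rank = int(os.environ["LOCAL_RANK"])  # set by torchrun
        torch.cuda.set_device(rank)
        dist.init_process_group(
            backend="nccl",  # RCCL on ROCm builds
            device_id=torch.device("cuda", rank),  # explicit device silences the barrier() warning
        )
        dist.barrier()
        # ... workload ...
        dist.destroy_process_group()  # explicit shutdown, as the NCCL warning requests

    if __name__ == "__main__":
        main()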
2025-12-04T15:04:54.8019271Z return func(*args, **kwargs) 2025-12-04T15:04:54.8019555Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8019640Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T15:04:54.8019926Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8020012Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T15:04:54.8020330Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8020439Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T15:04:54.8020667Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8020711Z return func(*args, **kwargs) 2025-12-04T15:04:54.8020933Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8020975Z return func(*args, **kwargs) 2025-12-04T15:04:54.8021209Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8021252Z return func(*args, **kwargs) 2025-12-04T15:04:54.8021469Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8021524Z return func(*args, **kwargs) 2025-12-04T15:04:54.8021742Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8021784Z return func(*args, **kwargs) 2025-12-04T15:04:54.8022001Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8022042Z return func(*args, **kwargs) 2025-12-04T15:04:54.8022259Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8022301Z return func(*args, **kwargs) 2025-12-04T15:04:54.8022519Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
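The FutureWarning repeated above deprecates the `NO_SHARD` sharding strategy in favor of `DistributedDataParallel`: `NO_SHARD` keeps parameters unsharded on every rank, which is what DDP already provides. A minimal sketch of that replacement (assumes a process group is already initialized, e.g. as in the previous sketch; the model is illustrative):

    import torch
    from torch.nn.parallel import DistributedDataParallel as DDP

    # Unsharded data parallelism via DDP, the suggested substitute for
    # FSDP's deprecated NO_SHARD strategy.
    model = torch.nn.Linear(8, 8).cuda()
    ddp_model = DDP(model, device_ids=[torch.cuda.current_device()])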
2025-12-04T15:04:54.8022561Z return func(*args, **kwargs) 2025-12-04T15:04:54.8022706Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8022868Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8023160Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8023316Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8023606Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8023730Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8024006Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8024155Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8024430Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8024596Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8024870Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8025007Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8025291Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8025440Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8025943Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 124416 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T15:04:54.8026059Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8026254Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8026620Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.8026735Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8026945Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8027110Z [rank0]:E1204 14:56:34.464000 443241 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8027148Z dist init r=0, world=4 2025-12-04T15:04:54.8027287Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8027444Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8027731Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8027887Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8028172Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8028296Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8028575Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8028732Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8029016Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8029162Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8029452Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8029589Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.8029863Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8030020Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8030544Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 116224 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T15:04:54.8030657Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8030851Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8031217Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.8031331Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8031541Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8031706Z [rank3]:E1204 14:56:34.475000 443244 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8031745Z dist init r=3, world=4 2025-12-04T15:04:54.8031882Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8032043Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8032328Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8032480Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8032764Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8032901Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8033188Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8033335Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T15:04:54.8033611Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8033768Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8034042Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8034191Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8034466Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8034612Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8035097Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 120320 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T15:04:54.8035211Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8035404Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8035769Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.8035881Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8036089Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8036252Z [rank2]:E1204 14:56:34.475000 443243 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8036291Z dist init r=2, world=4 2025-12-04T15:04:54.8036427Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8036585Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8036870Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8037023Z [rank1]:E1204 14:56:34.517000 443242 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8037319Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8037452Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8037726Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8037881Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8038155Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8038301Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8038583Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8038718Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8038996Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8039143Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8039634Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 120320 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
2025-12-04T15:04:54.8039749Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8039942Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8040344Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.8040457Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8040667Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8040829Z [rank1]:E1204 14:56:34.517000 443242 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8040867Z dist init r=1, world=4 2025-12-04T15:04:54.8041203Z [rank0]:[W1204 14:56:34.135255826 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8041241Z FAILED [9.6204s] [100%] 2025-12-04T15:04:54.8041257Z 2025-12-04T15:04:54.8041313Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8041431Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda _ 2025-12-04T15:04:54.8041478Z Traceback (most recent call last): 2025-12-04T15:04:54.8041640Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8041685Z self._join_processes(fn) 2025-12-04T15:04:54.8041858Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8041913Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8042104Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8042149Z raise RuntimeError(error) 2025-12-04T15:04:54.8042227Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8042273Z Traceback (most recent call last): 2025-12-04T15:04:54.8042433Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8042488Z getattr(self, test_name)() 2025-12-04T15:04:54.8042644Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8042680Z fn() 2025-12-04T15:04:54.8042829Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8042872Z method(*args, **kwargs) 2025-12-04T15:04:54.8043021Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8043062Z method(*args, **kwargs) 2025-12-04T15:04:54.8043209Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8043248Z with policy(): 2025-12-04T15:04:54.8043397Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8043440Z raise RuntimeError(msg) 2025-12-04T15:04:54.8043799Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 124416 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T15:04:54.8043801Z 2025-12-04T15:04:54.8043875Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8044116Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.8044119Z 2025-12-04T15:04:54.8044206Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8044209Z 2025-12-04T15:04:54.8044211Z 2025-12-04T15:04:54.8044287Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8044374Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8044605Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7985ca189ce8bda8.xml - 2025-12-04T15:04:54.8044665Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8044921Z FAILED [9.6204s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8044967Z Traceback (most recent call last): 2025-12-04T15:04:54.8045140Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8045202Z getattr(self, test_name)() 2025-12-04T15:04:54.8045360Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8045395Z fn() 2025-12-04T15:04:54.8045543Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8045584Z method(*args, **kwargs) 2025-12-04T15:04:54.8045732Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8045772Z method(*args, **kwargs) 2025-12-04T15:04:54.8045930Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8045968Z with policy(): 2025-12-04T15:04:54.8046120Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8046162Z raise RuntimeError(msg) 2025-12-04T15:04:54.8046532Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 124416 on device 0. 
CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T15:04:54.8046535Z 2025-12-04T15:04:54.8046609Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8046844Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.8046847Z 2025-12-04T15:04:54.8046933Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8046997Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.8047061Z ======================= 1 failed, 26 deselected in 9.78s ======================= 2025-12-04T15:04:54.8047098Z Got exit code 1 2025-12-04T15:04:54.8047286Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_no_shard_cuda 2025-12-04T15:04:54.8047410Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.8047597Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4f49bcb9e26e344c.xml 2025-12-04T15:04:54.8047655Z ============================= test session starts ============================== 2025-12-04T15:04:54.8047766Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8047808Z cachedir: .pytest_cache 2025-12-04T15:04:54.8047965Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8048012Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8048051Z configfile: pytest.ini 2025-12-04T15:04:54.8048211Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8048285Z collecting ... collected 60 items / 13 deselected / 47 selected 2025-12-04T15:04:54.8048338Z stepcurrent: skipping 13 already run items. 2025-12-04T15:04:54.8048381Z Running 14 items in this shard 2025-12-04T15:04:54.8048383Z 2025-12-04T15:04:54.8048694Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda I1204 14:56:38.932000 443574 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 443643 2025-12-04T15:04:54.8048846Z I1204 14:56:38.933000 443574 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 443644 2025-12-04T15:04:54.8049017Z I1204 14:56:38.934000 443574 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 443645 2025-12-04T15:04:54.8049166Z I1204 14:56:38.934000 443574 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 443646 2025-12-04T15:04:54.8049748Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8049786Z _warn_cpu_init() 2025-12-04T15:04:54.8050073Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8050118Z return func(*args, **kwargs) 2025-12-04T15:04:54.8050726Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8050765Z _warn_cpu_init() 2025-12-04T15:04:54.8051326Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8051368Z _warn_cpu_init() 2025-12-04T15:04:54.8051933Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8051969Z _warn_cpu_init() 2025-12-04T15:04:54.8052111Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8052272Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8052563Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8052716Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8053002Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8053126Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8053416Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8053576Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8053849Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8053996Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8054281Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8054419Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8054705Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8054853Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8055334Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 
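The `_warn_cpu_init()` UserWarning repeated above fires because the module handed to FSDP still lives on CPU. Per the warning text, passing `device_id` lets FSDP move the module to GPU before sharding initialization, which is also required for `sync_module_states=True`. A minimal sketch (assumes an initialized process group; the model is illustrative):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    model = torch.nn.Linear(8, 8)  # still on CPU, as in the warning
    fsdp_model = FSDP(
        model,
        device_id=torch.cuda.current_device(),  # run sharding init on GPU
        sync_module_states=True,                # needs GPU communication, hence device_id
    )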
2025-12-04T15:04:54.8055448Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8055646Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8056007Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda
2025-12-04T15:04:54.8056121Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8056331Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8056494Z [rank2]:E1204 14:56:46.915000 443645 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T15:04:54.8056535Z dist init r=2, world=4
2025-12-04T15:04:54.8056672Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8056830Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8057115Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8057268Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8057551Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8057693Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8057970Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8058117Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8058400Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8058547Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8058832Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8058967Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8059242Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8059389Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8059866Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400.
2025-12-04T15:04:54.8059980Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8060200Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8060561Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda
2025-12-04T15:04:54.8060673Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8060884Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8061046Z [rank1]:E1204 14:56:46.917000 443644 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T15:04:54.8061085Z dist init r=1, world=4
2025-12-04T15:04:54.8061221Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8061379Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8061663Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8061838Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8062134Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8062257Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8062551Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8062699Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8062974Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8063132Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8063405Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8063540Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8063816Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8063965Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8064441Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 17920 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280.
2025-12-04T15:04:54.8064554Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8064749Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8065109Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda
2025-12-04T15:04:54.8065223Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8065433Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8065595Z [rank0]:E1204 14:56:46.929000 443643 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T15:04:54.8065633Z dist init r=0, world=4
2025-12-04T15:04:54.8068312Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8068475Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8068800Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8068952Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8069235Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8069372Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8069650Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8069814Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8070089Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8070275Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8070550Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8070688Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8070966Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8071113Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8071590Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536.
2025-12-04T15:04:54.8071704Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8071905Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8072267Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda
2025-12-04T15:04:54.8072379Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8072588Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8072750Z [rank3]:E1204 14:56:46.942000 443646 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T15:04:54.8072821Z dist init r=3, world=4
2025-12-04T15:04:54.8073158Z [rank0]:[W1204 14:56:47.635925507 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T15:04:54.8073197Z FAILED [9.9191s] [ 7%]
2025-12-04T15:04:54.8073199Z
2025-12-04T15:04:54.8073256Z =================================== FAILURES ===================================
2025-12-04T15:04:54.8073354Z __ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda __
2025-12-04T15:04:54.8073400Z Traceback (most recent call last):
2025-12-04T15:04:54.8073577Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.8073622Z self._join_processes(fn)
2025-12-04T15:04:54.8073794Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.8073850Z self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.8074040Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.8074084Z raise RuntimeError(error)
2025-12-04T15:04:54.8074164Z RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T15:04:54.8074208Z Traceback (most recent call last):
2025-12-04T15:04:54.8074369Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8074412Z getattr(self, test_name)()
2025-12-04T15:04:54.8074568Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8074604Z fn()
2025-12-04T15:04:54.8074755Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8074796Z method(*args, **kwargs)
2025-12-04T15:04:54.8074945Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8074985Z method(*args, **kwargs)
2025-12-04T15:04:54.8075133Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8075169Z with policy():
2025-12-04T15:04:54.8075319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8075360Z raise RuntimeError(msg)
2025-12-04T15:04:54.8075712Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400.
2025-12-04T15:04:54.8075715Z
2025-12-04T15:04:54.8075789Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8076025Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda
2025-12-04T15:04:54.8076027Z
2025-12-04T15:04:54.8076114Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8076116Z
2025-12-04T15:04:54.8076119Z
2025-12-04T15:04:54.8076194Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T15:04:54.8076283Z Process 1 terminated with exit code 10, terminating remaining processes.
2025-12-04T15:04:54.8076517Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4f49bcb9e26e344c.xml -
2025-12-04T15:04:54.8076592Z =========================== short test summary info ============================
2025-12-04T15:04:54.8076857Z FAILED [9.9191s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T15:04:54.8078904Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T15:04:54.8078966Z ====================== 1 failed, 13 deselected in 10.06s =======================
2025-12-04T15:04:54.8079004Z Got exit code 1
2025-12-04T15:04:54.8079043Z Retrying single test...
2025-12-04T15:04:54.8079233Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-acce6c11eee53540.xml
2025-12-04T15:04:54.8079290Z ============================= test session starts ==============================
2025-12-04T15:04:54.8079408Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.8079449Z cachedir: .pytest_cache
2025-12-04T15:04:54.8079607Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.8079652Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.8079693Z configfile: pytest.ini
2025-12-04T15:04:54.8079853Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.8079926Z collecting ... collected 60 items / 26 deselected / 34 selected
2025-12-04T15:04:54.8080154Z stepcurrent: skipping 13 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda
2025-12-04T15:04:54.8080230Z Running 1 items in this shard
2025-12-04T15:04:54.8080245Z
2025-12-04T15:04:54.8080551Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda I1204 14:56:51.413000 443976 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 444045
2025-12-04T15:04:54.8080720Z I1204 14:56:51.413000 443976 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 444046
2025-12-04T15:04:54.8080869Z I1204 14:56:51.414000 443976 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 444047
2025-12-04T15:04:54.8081019Z I1204 14:56:51.415000 443976 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 444048
2025-12-04T15:04:54.8083943Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8084104Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8084389Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8084543Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8084825Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8084973Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8085248Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8085393Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8085677Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8085824Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8086109Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8086244Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8086518Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8086666Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8087147Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400.
2025-12-04T15:04:54.8087263Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8087455Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8087816Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda
2025-12-04T15:04:54.8087930Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8088143Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8088305Z [rank1]:E1204 14:56:59.529000 444046 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T15:04:54.8088441Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8088601Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8088885Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8089052Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8089346Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8089470Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8089754Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8089900Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8090208Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8090367Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8090646Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8090780Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8091055Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8091201Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8091677Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536.
2025-12-04T15:04:54.8091791Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8091983Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8092341Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda
2025-12-04T15:04:54.8092454Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8092666Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8092829Z [rank3]:E1204 14:56:59.529000 444048 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T15:04:54.8092868Z dist init r=1, world=4
2025-12-04T15:04:54.8092906Z dist init r=3, world=4
2025-12-04T15:04:54.8093043Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8093212Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8093510Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8093661Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8093942Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8094083Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8094355Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8094617Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8094891Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8095036Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8095312Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8095448Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8095726Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8095874Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8096412Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280.
2025-12-04T15:04:54.8096527Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8096724Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8097086Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda
2025-12-04T15:04:54.8097198Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8097415Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8097580Z [rank0]:E1204 14:56:59.553000 444045 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T15:04:54.8097640Z dist init r=0, world=4
2025-12-04T15:04:54.8097776Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8097935Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8098218Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8098382Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8098666Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8098790Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8099076Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8099222Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8099496Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8099642Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8099922Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8100056Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8100379Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8100526Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8101002Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 24064 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184.
2025-12-04T15:04:54.8101116Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8101308Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8101667Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda
2025-12-04T15:04:54.8101777Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8102001Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8102178Z [rank2]:E1204 14:56:59.563000 444047 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T15:04:54.8102217Z dist init r=2, world=4
2025-12-04T15:04:54.8102553Z [rank0]:[W1204 14:56:59.338347553 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T15:04:54.8102593Z FAILED [10.0198s] [100%]
2025-12-04T15:04:54.8102610Z
2025-12-04T15:04:54.8102667Z =================================== FAILURES ===================================
2025-12-04T15:04:54.8102765Z __ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda __
2025-12-04T15:04:54.8102814Z Traceback (most recent call last):
2025-12-04T15:04:54.8102976Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.8103036Z self._join_processes(fn)
2025-12-04T15:04:54.8103209Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.8103262Z self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.8103438Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.8103482Z raise RuntimeError(error)
2025-12-04T15:04:54.8103560Z RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T15:04:54.8103604Z Traceback (most recent call last):
2025-12-04T15:04:54.8103762Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8103806Z getattr(self, test_name)()
2025-12-04T15:04:54.8103963Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8103998Z fn()
2025-12-04T15:04:54.8104147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8104187Z method(*args, **kwargs)
2025-12-04T15:04:54.8104336Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8104376Z method(*args, **kwargs)
2025-12-04T15:04:54.8104526Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8104563Z with policy():
2025-12-04T15:04:54.8104712Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8104754Z raise RuntimeError(msg)
2025-12-04T15:04:54.8105106Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536.
2025-12-04T15:04:54.8105111Z
2025-12-04T15:04:54.8105185Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8105417Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda
2025-12-04T15:04:54.8105420Z
2025-12-04T15:04:54.8105508Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8105511Z
2025-12-04T15:04:54.8105513Z
2025-12-04T15:04:54.8105599Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T15:04:54.8105695Z Process 3 terminated with exit code 10, terminating remaining processes.
2025-12-04T15:04:54.8105926Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-acce6c11eee53540.xml -
2025-12-04T15:04:54.8105986Z =========================== short test summary info ============================
2025-12-04T15:04:54.8106233Z FAILED [10.0198s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T15:04:54.8108271Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T15:04:54.8108333Z ====================== 1 failed, 26 deselected in 10.18s =======================
2025-12-04T15:04:54.8108372Z Got exit code 1
2025-12-04T15:04:54.8108414Z Retrying single test...
2025-12-04T15:04:54.8108600Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-50bd696d2e06cf26.xml
2025-12-04T15:04:54.8108659Z ============================= test session starts ==============================
2025-12-04T15:04:54.8108772Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.8108815Z cachedir: .pytest_cache
2025-12-04T15:04:54.8108972Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.8109017Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.8109057Z configfile: pytest.ini
2025-12-04T15:04:54.8109218Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.8109301Z collecting ... collected 60 items / 26 deselected / 34 selected
2025-12-04T15:04:54.8109546Z stepcurrent: skipping 13 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda
2025-12-04T15:04:54.8109591Z Running 1 items in this shard
2025-12-04T15:04:54.8109595Z
2025-12-04T15:04:54.8109900Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda I1204 14:57:03.975000 444378 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 444447
2025-12-04T15:04:54.8110053Z I1204 14:57:03.975000 444378 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 444448
2025-12-04T15:04:54.8110254Z I1204 14:57:03.976000 444378 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 444449
2025-12-04T15:04:54.8110403Z I1204 14:57:03.977000 444378 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 444450
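Separately from the leak itself, the ProcessGroupNCCL warning in each run flags that destroy_process_group() was never called before exit. A minimal sketch of the shutdown order that warning asks for (run_workload is a placeholder, not the test harness's entry point):

    import torch.distributed as dist

    def run_workload() -> None:
        pass  # placeholder for the actual training/test body

    def main() -> None:
        dist.init_process_group("nccl")
        try:
            run_workload()
        finally:
            # Tear down communicators before the process exits so NCCL
            # resources are released; this is the missing call the
            # ProcessGroupNCCL warning points at.
            dist.destroy_process_group()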
You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8111362Z return func(*args, **kwargs) 2025-12-04T15:04:54.8111925Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8111964Z _warn_cpu_init() 2025-12-04T15:04:54.8112524Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8112561Z _warn_cpu_init() 2025-12-04T15:04:54.8113120Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8113157Z _warn_cpu_init() 2025-12-04T15:04:54.8113301Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8113465Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8113753Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8113933Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8114219Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8114344Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8114628Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8114776Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8115062Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 3329, in wrapper 2025-12-04T15:04:54.8115209Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8115483Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8115619Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8115897Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8116046Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8116532Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T15:04:54.8116652Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8116848Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8117209Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T15:04:54.8117324Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8117535Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8117699Z [rank0]:E1204 14:57:12.049000 444447 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8117742Z dist init r=0, world=4 2025-12-04T15:04:54.8117878Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8118047Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8118346Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8118500Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8118798Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8118926Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8119204Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8119359Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8119636Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8119780Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8120056Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8120225Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8120503Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8120651Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8121126Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 
2025-12-04T15:04:54.8121241Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8121435Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8121791Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T15:04:54.8121903Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8122114Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8122279Z [rank1]:E1204 14:57:12.073000 444448 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8122341Z dist init r=1, world=4 2025-12-04T15:04:54.8122479Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8122637Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8122926Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8123093Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8123377Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8123500Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8123788Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8123935Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8124209Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8124353Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8124628Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8124762Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.8125035Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8125183Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8125662Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 26112 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T15:04:54.8125777Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8125972Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8126331Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T15:04:54.8126443Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8126677Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8126842Z [rank2]:E1204 14:57:12.074000 444449 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8126879Z dist init r=2, world=4 2025-12-04T15:04:54.8127020Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8127176Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8127474Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8127628Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8127921Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8128043Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8128316Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8128462Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.8128735Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8128881Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8129153Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8129288Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8129563Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8129709Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8130224Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T15:04:54.8130336Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8130531Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8130887Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T15:04:54.8131025Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8131235Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8131395Z [rank3]:E1204 14:57:12.120000 444450 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8131434Z dist init r=3, world=4 2025-12-04T15:04:54.8131776Z [rank0]:[W1204 14:57:12.725671438 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8131819Z FAILED [10.0226s] [100%] 2025-12-04T15:04:54.8131821Z 2025-12-04T15:04:54.8131876Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8131987Z __ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda __ 2025-12-04T15:04:54.8132032Z Traceback (most recent call last): 2025-12-04T15:04:54.8132194Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8132238Z self._join_processes(fn) 2025-12-04T15:04:54.8132412Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8132464Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8132641Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8132687Z raise RuntimeError(error) 2025-12-04T15:04:54.8132765Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8132808Z Traceback (most recent call last): 2025-12-04T15:04:54.8132969Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8133010Z getattr(self, test_name)() 2025-12-04T15:04:54.8133167Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8133201Z fn() 2025-12-04T15:04:54.8133351Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8133391Z method(*args, **kwargs) 2025-12-04T15:04:54.8133540Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8133581Z method(*args, **kwargs) 2025-12-04T15:04:54.8133730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8133767Z with policy(): 2025-12-04T15:04:54.8133920Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8133962Z raise RuntimeError(msg) 2025-12-04T15:04:54.8134316Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
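[editor's note] The leak checker compares caching-allocator usage and driver-level usage before and after the test body, which is where the two before/after byte counts in the RuntimeError above come from. A rough standalone way to watch the same two numbers, a sketch only and not the harness's exact bookkeeping (run_workload is a hypothetical placeholder):

    import torch

    dev = 0
    alloc_before = torch.cuda.memory_allocated(dev)   # caching allocator bytes
    free_before, total = torch.cuda.mem_get_info(dev)
    driver_before = total - free_before               # approximate driver-level usage

    run_workload()                                    # hypothetical: the code under test
    torch.cuda.synchronize(dev)

    alloc_after = torch.cuda.memory_allocated(dev)
    free_after, _ = torch.cuda.mem_get_info(dev)
    print("allocator delta:", alloc_after - alloc_before)
    print("driver delta:", (total - free_after) - driver_before)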
2025-12-04T15:04:54.8134318Z 2025-12-04T15:04:54.8134392Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8134625Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T15:04:54.8134647Z 2025-12-04T15:04:54.8134736Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8134739Z 2025-12-04T15:04:54.8134740Z 2025-12-04T15:04:54.8134814Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8134901Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8135130Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-50bd696d2e06cf26.xml - 2025-12-04T15:04:54.8135205Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8135454Z FAILED [10.0226s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8135501Z Traceback (most recent call last): 2025-12-04T15:04:54.8135751Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8135803Z getattr(self, test_name)() 2025-12-04T15:04:54.8135962Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8135997Z fn() 2025-12-04T15:04:54.8136147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8136187Z method(*args, **kwargs) 2025-12-04T15:04:54.8136342Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8136381Z method(*args, **kwargs) 2025-12-04T15:04:54.8136530Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8136568Z with policy(): 2025-12-04T15:04:54.8136718Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8136758Z raise RuntimeError(msg) 2025-12-04T15:04:54.8137113Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T15:04:54.8137116Z 2025-12-04T15:04:54.8137188Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8137423Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T15:04:54.8137427Z 2025-12-04T15:04:54.8137512Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8137577Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
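[editor's note] The repro line printed with each failure can also be driven from Python; a sketch that sets the same environment variables the log names and runs the single test from the repo root:

    import os
    import subprocess

    env = dict(os.environ,
               PYTORCH_TEST_WITH_ROCM="1",
               PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")
    subprocess.run(
        ["python", "test/distributed/fsdp/test_fsdp_core.py",
         "TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda"],
        env=env,
        check=False,  # inspect returncode rather than raising
    )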
2025-12-04T15:04:54.8137641Z ====================== 1 failed, 26 deselected in 10.18s ======================= 2025-12-04T15:04:54.8137678Z Got exit code 1 2025-12-04T15:04:54.8137859Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T15:04:54.8137985Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.8138172Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b4dbf8696cc1aacf.xml 2025-12-04T15:04:54.8138229Z ============================= test session starts ============================== 2025-12-04T15:04:54.8138341Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8138403Z cachedir: .pytest_cache 2025-12-04T15:04:54.8138559Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8138605Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8138645Z configfile: pytest.ini 2025-12-04T15:04:54.8138805Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8138878Z collecting ... collected 60 items / 14 deselected / 46 selected 2025-12-04T15:04:54.8138932Z stepcurrent: skipping 14 already run items. 2025-12-04T15:04:54.8138975Z Running 13 items in this shard 2025-12-04T15:04:54.8138977Z 2025-12-04T15:04:54.8139303Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_no_shard_cuda I1204 14:57:16.610000 444780 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 444849 2025-12-04T15:04:54.8139458Z I1204 14:57:16.610000 444780 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 444850 2025-12-04T15:04:54.8139625Z I1204 14:57:16.611000 444780 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 444851 2025-12-04T15:04:54.8139775Z I1204 14:57:16.611000 444780 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 444852 2025-12-04T15:04:54.8140068Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8140120Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8140443Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8140494Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8141073Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8141109Z _warn_cpu_init() 2025-12-04T15:04:54.8141675Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8141714Z _warn_cpu_init() 2025-12-04T15:04:54.8141998Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8142048Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8142613Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8142665Z _warn_cpu_init() 2025-12-04T15:04:54.8142963Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8143042Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8143323Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8143399Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8143695Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8143770Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8144067Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8144116Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8144682Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8144719Z _warn_cpu_init() 2025-12-04T15:04:54.8145004Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8145079Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8146356Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8146485Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8146713Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8146756Z return func(*args, **kwargs) 2025-12-04T15:04:54.8148014Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8148164Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8148390Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8148432Z return func(*args, **kwargs) 2025-12-04T15:04:54.8149699Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8149821Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8150044Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8150086Z return func(*args, **kwargs) 2025-12-04T15:04:54.8151364Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8151484Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8151708Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8151750Z return func(*args, **kwargs) 2025-12-04T15:04:54.8151968Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8152033Z return func(*args, **kwargs) 2025-12-04T15:04:54.8152252Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8152293Z return func(*args, **kwargs) 2025-12-04T15:04:54.8152511Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T15:04:54.8152551Z return func(*args, **kwargs) 2025-12-04T15:04:54.8152780Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8152821Z return func(*args, **kwargs) 2025-12-04T15:04:54.8153108Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8153149Z return func(*args, **kwargs) 2025-12-04T15:04:54.8153307Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8153470Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8153761Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8153916Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8154202Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8154327Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8154603Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8154750Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8155026Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8155172Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8155447Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8155582Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8155859Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8156007Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8156499Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver
API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3858759680. 2025-12-04T15:04:54.8156623Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8156816Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8157198Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8157311Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8157532Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8157696Z [rank2]:E1204 14:57:24.241000 444851 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8157735Z dist init r=2, world=4 2025-12-04T15:04:54.8157872Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8158030Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8158317Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8158471Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8158754Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8158875Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8159150Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8159296Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8159571Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8159717Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8159989Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8160125Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8160428Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8160602Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8161083Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 154112 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 2025-12-04T15:04:54.8161209Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8161403Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8161775Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8161889Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8162097Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8162261Z [rank3]:E1204 14:57:24.244000 444852 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8162299Z dist init r=3, world=4 2025-12-04T15:04:54.8162436Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8162594Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8162879Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8163033Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8163315Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8163437Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8163711Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] 
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8163858Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8164130Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8164277Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8164549Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8164706Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8164981Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8165127Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8165619Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 0. CUDA driver allocated memory was 2453667840 and is now 4011851776. 
2025-12-04T15:04:54.8165733Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8165937Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8166296Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8166408Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8166616Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8166777Z [rank0]:E1204 14:57:24.250000 444849 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8166817Z dist init r=0, world=4 2025-12-04T15:04:54.8166953Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8167111Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8167395Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8167548Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8167829Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8167955Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8168229Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8168373Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8168647Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8168803Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8169087Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8169220Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.8169496Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8169651Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8170139Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 2025-12-04T15:04:54.8170296Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8170488Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8170849Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8170960Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8171170Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8171331Z [rank1]:E1204 14:57:24.256000 444850 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8171371Z dist init r=1, world=4 2025-12-04T15:04:54.8171705Z [rank0]:[W1204 14:57:24.956427189 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8171746Z FAILED [9.6206s] [ 7%] 2025-12-04T15:04:54.8171748Z 2025-12-04T15:04:54.8171804Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8171905Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda __ 2025-12-04T15:04:54.8171952Z Traceback (most recent call last): 2025-12-04T15:04:54.8172115Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8172160Z self._join_processes(fn) 2025-12-04T15:04:54.8172330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8172384Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8172559Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8172602Z raise RuntimeError(error) 2025-12-04T15:04:54.8172680Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.8172740Z Traceback (most recent call last): 2025-12-04T15:04:54.8172898Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8172960Z getattr(self, test_name)() 2025-12-04T15:04:54.8173116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8173152Z fn() 2025-12-04T15:04:54.8173300Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8173341Z method(*args, **kwargs) 2025-12-04T15:04:54.8173488Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8173540Z method(*args, **kwargs) 2025-12-04T15:04:54.8173688Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8173727Z with policy(): 2025-12-04T15:04:54.8173878Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8173920Z raise RuntimeError(msg) 2025-12-04T15:04:54.8174290Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 
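[editor's note] Two process-group warnings recur in this log: barrier() picking a device from the current context, and destroy_process_group() never being called before exit. Both point at the same lifecycle hygiene; a sketch, assuming an NCCL backend and a torchrun-style launcher environment:

    import os
    import torch
    import torch.distributed as dist

    local_rank = int(os.environ["LOCAL_RANK"])  # assumption: launcher-provided
    torch.cuda.set_device(local_rank)
    # Binding device_id at init addresses the barrier() device warning above.
    dist.init_process_group("nccl", device_id=torch.device("cuda", local_rank))
    try:
        dist.barrier()
        # ... test or training body ...
    finally:
        dist.destroy_process_group()  # avoids the NCCL shutdown warning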
2025-12-04T15:04:54.8174292Z 2025-12-04T15:04:54.8174368Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8174607Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8174610Z 2025-12-04T15:04:54.8174699Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8174702Z 2025-12-04T15:04:54.8174761Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.8174804Z Traceback (most recent call last): 2025-12-04T15:04:54.8174966Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8175007Z getattr(self, test_name)() 2025-12-04T15:04:54.8175164Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8175198Z fn() 2025-12-04T15:04:54.8175348Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8175389Z method(*args, **kwargs) 2025-12-04T15:04:54.8175537Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8175576Z method(*args, **kwargs) 2025-12-04T15:04:54.8175727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8175764Z with policy(): 2025-12-04T15:04:54.8175916Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8175956Z raise RuntimeError(msg) 2025-12-04T15:04:54.8176314Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3858759680. 2025-12-04T15:04:54.8176316Z 2025-12-04T15:04:54.8176389Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8176624Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8176645Z 2025-12-04T15:04:54.8176734Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8176736Z 2025-12-04T15:04:54.8176738Z 2025-12-04T15:04:54.8176815Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8176903Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T15:04:54.8177135Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b4dbf8696cc1aacf.xml - 2025-12-04T15:04:54.8177196Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8177459Z FAILED [9.6206s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.8177507Z Traceback (most recent call last): 2025-12-04T15:04:54.8177666Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8177709Z getattr(self, test_name)() 2025-12-04T15:04:54.8177877Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8177913Z fn() 2025-12-04T15:04:54.8178061Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8178102Z method(*args, **kwargs) 2025-12-04T15:04:54.8178250Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8178290Z method(*args, **kwargs) 2025-12-04T15:04:54.8178437Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8178475Z with policy(): 2025-12-04T15:04:54.8178624Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8178664Z raise RuntimeError(msg) 2025-12-04T15:04:54.8179017Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 
2025-12-04T15:04:54.8179021Z 2025-12-04T15:04:54.8179092Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8179326Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8179328Z 2025-12-04T15:04:54.8179412Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8179415Z 2025-12-04T15:04:54.8179474Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.8179517Z Traceback (most recent call last): 2025-12-04T15:04:54.8179677Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8179717Z getattr(self, test_name)() 2025-12-04T15:04:54.8179874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8179907Z fn() 2025-12-04T15:04:54.8180055Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8180095Z method(*args, **kwargs) 2025-12-04T15:04:54.8180271Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8180326Z method(*args, **kwargs) 2025-12-04T15:04:54.8180473Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8180525Z with policy(): 2025-12-04T15:04:54.8180675Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8180715Z raise RuntimeError(msg) 2025-12-04T15:04:54.8181065Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3858759680. 2025-12-04T15:04:54.8181067Z 2025-12-04T15:04:54.8181154Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8181385Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8181389Z 2025-12-04T15:04:54.8181474Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8181549Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.8181613Z ======================= 1 failed, 14 deselected in 9.78s ======================= 2025-12-04T15:04:54.8181649Z Got exit code 1 2025-12-04T15:04:54.8181690Z Retrying single test... 
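The "To execute this test, run the following from the base repo dir" lines above are the exact local repro for this failure. A sketch of driving the same command from Python with the same environment, assuming it is run from the repo root; the subprocess wrapper is illustrative and the shell one-liner printed in the log works as-is:

    import os
    import subprocess

    env = dict(
        os.environ,
        PYTORCH_TEST_WITH_ROCM="1",            # ROCm builds surface as CUDA devices
        PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",  # enable the leak check that failed here
    )
    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_core.py",
            "TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda",
        ],
        env=env,
        check=False,  # a failing run is expected while the leak reproduces
    )

As the log notes, setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 in the same environment suppresses the repro banner without changing the leak check itself.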
2025-12-04T15:04:54.8181874Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8b804861c496a109.xml 2025-12-04T15:04:54.8181937Z ============================= test session starts ============================== 2025-12-04T15:04:54.8182050Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8182093Z cachedir: .pytest_cache 2025-12-04T15:04:54.8182249Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8182295Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8182335Z configfile: pytest.ini 2025-12-04T15:04:54.8182498Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8182573Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8182804Z stepcurrent: skipping 14 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8182848Z Running 1 items in this shard 2025-12-04T15:04:54.8182850Z 2025-12-04T15:04:54.8183162Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_no_shard_cuda I1204 14:57:28.641000 445182 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 445251 2025-12-04T15:04:54.8183315Z I1204 14:57:28.641000 445182 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 445252 2025-12-04T15:04:54.8183467Z I1204 14:57:28.642000 445182 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 445253 2025-12-04T15:04:54.8183615Z I1204 14:57:28.642000 445182 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 445254 2025-12-04T15:04:54.8183902Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8183955Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8184237Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8184306Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8184585Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8184633Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8185216Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8185255Z _warn_cpu_init() 2025-12-04T15:04:54.8185836Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8185874Z _warn_cpu_init() 2025-12-04T15:04:54.8186156Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8186204Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8186766Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8186806Z _warn_cpu_init() 2025-12-04T15:04:54.8187368Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8187404Z _warn_cpu_init() 2025-12-04T15:04:54.8187687Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8187765Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8188047Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8188123Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8188404Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8188477Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8188756Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T15:04:54.8188861Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8190148Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8190304Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8190533Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8190576Z return func(*args, **kwargs) 2025-12-04T15:04:54.8191838Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8191962Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8193215Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. 
If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8193336Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8193576Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8193632Z return func(*args, **kwargs) 2025-12-04T15:04:54.8193853Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8193895Z return func(*args, **kwargs) 2025-12-04T15:04:54.8195173Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8195294Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8195518Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8195559Z return func(*args, **kwargs) 2025-12-04T15:04:54.8195779Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8195821Z return func(*args, **kwargs) 2025-12-04T15:04:54.8196038Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8196078Z return func(*args, **kwargs) 2025-12-04T15:04:54.8196294Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8196333Z return func(*args, **kwargs) 2025-12-04T15:04:54.8196549Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T15:04:54.8196588Z return func(*args, **kwargs) 2025-12-04T15:04:54.8196875Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8196916Z return func(*args, **kwargs) 2025-12-04T15:04:54.8197059Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8197219Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8197506Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8197660Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8197954Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8198090Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8198370Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8198517Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8198800Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8198948Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8199234Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8199369Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8199651Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8199796Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8200310Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3858759680. 
2025-12-04T15:04:54.8200425Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8200619Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8200983Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8201096Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8201308Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8201469Z [rank2]:E1204 14:57:36.176000 445253 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8201508Z dist init r=2, world=4 2025-12-04T15:04:54.8201645Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8201803Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8202089Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8202269Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8202551Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8202674Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8202961Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8203107Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8203403Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8203549Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8203823Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8203958Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.8204235Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8204384Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8204861Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 0. CUDA driver allocated memory was 2453667840 and is now 4011851776. 2025-12-04T15:04:54.8204976Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8205170Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8205534Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8205647Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8205856Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8206019Z [rank0]:E1204 14:57:36.183000 445251 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8206060Z dist init r=0, world=4 2025-12-04T15:04:54.8206198Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8206366Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8206662Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8206813Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8207208Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8207332Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8207609Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8207765Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T15:04:54.8208041Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8208187Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8208460Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8208596Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8208875Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8209021Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8209496Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 2025-12-04T15:04:54.8209612Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8209806Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8210164Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8210322Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8210530Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8210707Z [rank3]:E1204 14:57:36.233000 445254 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8210759Z dist init r=3, world=4 2025-12-04T15:04:54.8210896Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8211053Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8211338Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8211500Z [rank1]:E1204 14:57:36.240000 445252 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8211784Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8211921Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8212198Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8212344Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8212619Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8212765Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8213038Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8213174Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8213452Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8213599Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8214079Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 
2025-12-04T15:04:54.8214196Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8214388Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8214747Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8214869Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8215090Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8215250Z [rank1]:E1204 14:57:36.240000 445252 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8215289Z dist init r=1, world=4 2025-12-04T15:04:54.8215638Z [rank0]:[W1204 14:57:36.876702246 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8215680Z FAILED [9.4211s] [100%] 2025-12-04T15:04:54.8215682Z 2025-12-04T15:04:54.8215736Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8215838Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda __ 2025-12-04T15:04:54.8215884Z Traceback (most recent call last): 2025-12-04T15:04:54.8216056Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8216100Z self._join_processes(fn) 2025-12-04T15:04:54.8216271Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8216323Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8216500Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8216544Z raise RuntimeError(error) 2025-12-04T15:04:54.8216623Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8216667Z Traceback (most recent call last): 2025-12-04T15:04:54.8216827Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8216870Z getattr(self, test_name)() 2025-12-04T15:04:54.8217029Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8217063Z fn() 2025-12-04T15:04:54.8217212Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8217252Z method(*args, **kwargs) 2025-12-04T15:04:54.8217402Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8217444Z method(*args, **kwargs) 2025-12-04T15:04:54.8217593Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8217631Z with policy(): 2025-12-04T15:04:54.8217781Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8217823Z raise RuntimeError(msg) 2025-12-04T15:04:54.8218184Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 0. CUDA driver allocated memory was 2453667840 and is now 4011851776. 2025-12-04T15:04:54.8218186Z 2025-12-04T15:04:54.8218260Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8218494Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8218497Z 2025-12-04T15:04:54.8218584Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8218597Z 2025-12-04T15:04:54.8218599Z 2025-12-04T15:04:54.8218682Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8218769Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8218999Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8b804861c496a109.xml - 2025-12-04T15:04:54.8219060Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8219309Z FAILED [9.4211s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8219378Z Traceback (most recent call last): 2025-12-04T15:04:54.8219540Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8219585Z getattr(self, test_name)() 2025-12-04T15:04:54.8219743Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8219777Z fn() 2025-12-04T15:04:54.8219936Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8219976Z method(*args, **kwargs) 2025-12-04T15:04:54.8220125Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8220163Z method(*args, **kwargs) 2025-12-04T15:04:54.8220353Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8220390Z with policy(): 2025-12-04T15:04:54.8220539Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8220580Z raise RuntimeError(msg) 2025-12-04T15:04:54.8220935Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 0. 
CUDA driver allocated memory was 2453667840 and is now 4011851776. 2025-12-04T15:04:54.8220937Z 2025-12-04T15:04:54.8221009Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8221243Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8221245Z 2025-12-04T15:04:54.8221332Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8221394Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.8221456Z ======================= 1 failed, 26 deselected in 9.58s ======================= 2025-12-04T15:04:54.8221495Z Got exit code 1 2025-12-04T15:04:54.8221533Z Retrying single test... 2025-12-04T15:04:54.8221722Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6f0f25d8cb00bfe3.xml 2025-12-04T15:04:54.8221780Z ============================= test session starts ============================== 2025-12-04T15:04:54.8221892Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8221932Z cachedir: .pytest_cache 2025-12-04T15:04:54.8222088Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8222133Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8222174Z configfile: pytest.ini 2025-12-04T15:04:54.8222332Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8222422Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8222664Z stepcurrent: skipping 14 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8222707Z Running 1 items in this shard 2025-12-04T15:04:54.8222709Z 2025-12-04T15:04:54.8223017Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_no_shard_cuda I1204 14:57:40.634000 445584 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 445653 2025-12-04T15:04:54.8223184Z I1204 14:57:40.634000 445584 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 445654 2025-12-04T15:04:54.8223336Z I1204 14:57:40.635000 445584 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 445655 2025-12-04T15:04:54.8223485Z I1204 14:57:40.635000 445584 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 445656 2025-12-04T15:04:54.8223788Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8223839Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8224418Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8224457Z _warn_cpu_init() 2025-12-04T15:04:54.8224741Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8224792Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8225072Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8225124Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8225404Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8225452Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8226018Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8226056Z _warn_cpu_init() 2025-12-04T15:04:54.8226619Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8226666Z _warn_cpu_init() 2025-12-04T15:04:54.8227228Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8227276Z _warn_cpu_init() 2025-12-04T15:04:54.8227576Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8227653Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8227937Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T15:04:54.8228014Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8228306Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8228382Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8228665Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8228739Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8230003Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8230129Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8230391Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8230435Z return func(*args, **kwargs) 2025-12-04T15:04:54.8231709Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T15:04:54.8231855Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8232079Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8232121Z return func(*args, **kwargs) 2025-12-04T15:04:54.8233393Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8233514Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8233738Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8233780Z return func(*args, **kwargs) 2025-12-04T15:04:54.8235029Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8235151Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8235374Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8235415Z return func(*args, **kwargs) 2025-12-04T15:04:54.8235634Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T15:04:54.8235677Z return func(*args, **kwargs) 2025-12-04T15:04:54.8235896Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8235949Z return func(*args, **kwargs) 2025-12-04T15:04:54.8236184Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8236225Z return func(*args, **kwargs) 2025-12-04T15:04:54.8236442Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8236482Z return func(*args, **kwargs) 2025-12-04T15:04:54.8236780Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8236820Z return func(*args, **kwargs) 2025-12-04T15:04:54.8236963Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8237124Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8237420Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8237573Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8237856Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8237979Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8238256Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8238406Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8238678Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8238827Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8239099Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8239237Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8239513Z 
[rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8239661Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8240145Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 160256 on device 0. CUDA driver allocated memory was 2453667840 and is now 4011851776. 2025-12-04T15:04:54.8240424Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8240639Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8241000Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8241127Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8241337Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8241502Z [rank0]:E1204 14:57:48.214000 445653 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8241542Z dist init r=0, world=4 2025-12-04T15:04:54.8241700Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8241858Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8242141Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8242293Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8242575Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8242703Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8242978Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8243125Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T15:04:54.8243399Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8243545Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8243820Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8243955Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8244230Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8244375Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8244854Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 160256 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 2025-12-04T15:04:54.8244993Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8245186Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8245557Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8245670Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8245895Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8246059Z [rank3]:E1204 14:57:48.215000 445656 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8246098Z dist init r=3, world=4 2025-12-04T15:04:54.8246235Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8246394Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8246679Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8246833Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T15:04:54.8247114Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8247237Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8247512Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8247658Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8247935Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8248079Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8248352Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8248487Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8248762Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8248930Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8249409Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 150016 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 
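For scale, the figures in these leak reports work out the same on every rank: the caching allocator grows by roughly 150 KiB, while driver-side usage grows by exactly 1,558,183,936 bytes (about 1.45 GiB) on all four devices. The identical driver delta suggests a fixed-size per-rank allocation (for example communicator buffers) rather than fragmentation; that reading is an inference from the arithmetic, not something the log states:

    # Deltas computed from the numbers quoted in the errors above.
    print(160256 - 512)                       # 159744 bytes (~156 KiB) allocator growth, device 0
    print(4011851776 - 2453667840)            # 1558183936 bytes driver growth, device 0
    print(3875536896 - 2317352960)            # 1558183936 bytes driver growth, device 1 (same)
    print((4011851776 - 2453667840) / 2**30)  # ~1.45 GiB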
2025-12-04T15:04:54.8249532Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8249724Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8250098Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8250246Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8250454Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8250619Z [rank1]:E1204 14:57:48.221000 445654 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8250657Z dist init r=1, world=4 2025-12-04T15:04:54.8250796Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8250953Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8251240Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8251391Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8251674Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8251796Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8252073Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8252219Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8252493Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8252637Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8252910Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8253066Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.8253353Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8253499Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8253984Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3858759680. 2025-12-04T15:04:54.8254099Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8254305Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8254663Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8254776Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8254984Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8255148Z [rank2]:E1204 14:57:48.270000 445655 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8255188Z dist init r=2, world=4 2025-12-04T15:04:54.8255526Z [rank0]:[W1204 14:57:48.931072392 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8255564Z FAILED [9.5204s] [100%] 2025-12-04T15:04:54.8255566Z 2025-12-04T15:04:54.8255621Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8255720Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda __ 2025-12-04T15:04:54.8255769Z Traceback (most recent call last): 2025-12-04T15:04:54.8255929Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8255975Z self._join_processes(fn) 2025-12-04T15:04:54.8256147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8256201Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8256378Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8256420Z raise RuntimeError(error) 2025-12-04T15:04:54.8256499Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8256543Z Traceback (most recent call last): 2025-12-04T15:04:54.8256703Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8256745Z getattr(self, test_name)() 2025-12-04T15:04:54.8256900Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8256945Z fn() 2025-12-04T15:04:54.8257106Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8257146Z method(*args, **kwargs) 2025-12-04T15:04:54.8257297Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8257337Z method(*args, **kwargs) 2025-12-04T15:04:54.8257484Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8257521Z with policy(): 2025-12-04T15:04:54.8257681Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8257722Z raise RuntimeError(msg) 2025-12-04T15:04:54.8258076Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 160256 on device 0. CUDA driver allocated memory was 2453667840 and is now 4011851776. 
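Two recurring warnings in this run point at the same process-group lifecycle: the barrier() message suggests passing `device_id` to `init_process_group`, and the NCCL shutdown message asks for an explicit `destroy_process_group()` before exit. A minimal sketch under those suggestions (the launcher is assumed to set `LOCAL_RANK`; recent PyTorch releases accept the `device_id` keyword, per the warning itself):

    import os
    import torch
    import torch.distributed as dist

    # Bind the process group to a concrete device up front.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    device = torch.device("cuda", local_rank)
    dist.init_process_group("nccl", device_id=device)

    dist.barrier()  # no longer ambiguous about which device to use

    # ... test or training body ...

    dist.destroy_process_group()  # explicit shutdown, as the warning recommends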
2025-12-04T15:04:54.8258080Z 2025-12-04T15:04:54.8258166Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8258402Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8258404Z 2025-12-04T15:04:54.8258490Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8258493Z 2025-12-04T15:04:54.8258551Z Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.8258596Z Traceback (most recent call last): 2025-12-04T15:04:54.8258757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8258800Z getattr(self, test_name)() 2025-12-04T15:04:54.8258955Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8258991Z fn() 2025-12-04T15:04:54.8259141Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8259180Z method(*args, **kwargs) 2025-12-04T15:04:54.8259327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8259368Z method(*args, **kwargs) 2025-12-04T15:04:54.8259519Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8259556Z with policy(): 2025-12-04T15:04:54.8259706Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8259747Z raise RuntimeError(msg) 2025-12-04T15:04:54.8260101Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 150016 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 2025-12-04T15:04:54.8260104Z 2025-12-04T15:04:54.8260214Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8260445Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8260447Z 2025-12-04T15:04:54.8260535Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8260538Z 2025-12-04T15:04:54.8260539Z 2025-12-04T15:04:54.8260617Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8260719Z Process 0 terminated with exit code 10, terminating remaining processes. 
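The "CUDA driver API confirmed a leak" check above compares per-device memory counters taken before and after the test body. A rough, hedged approximation of the two figures it quotes follows; the real harness in common_utils.py does more bookkeeping (retries, per-device tracking) than this sketch:

    import torch

    dev = torch.device("cuda", 0)

    # "Caching allocator allocated memory": PyTorch's own allocator counter.
    alloc_before = torch.cuda.memory_allocated(dev)
    # "CUDA driver allocated memory": approximated as total minus free,
    # as reported by the driver.
    free_b, total = torch.cuda.mem_get_info(dev)
    driver_before = total - free_b

    # ... run the test body here ...

    alloc_after = torch.cuda.memory_allocated(dev)
    free_a, _ = torch.cuda.mem_get_info(dev)
    driver_after = total - free_a

    print(f"allocator: {alloc_before} -> {alloc_after}")
    print(f"driver:    {driver_before} -> {driver_after}")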
2025-12-04T15:04:54.8260966Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6f0f25d8cb00bfe3.xml - 2025-12-04T15:04:54.8261026Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8261276Z FAILED [9.5204s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8261321Z Traceback (most recent call last): 2025-12-04T15:04:54.8261499Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8261541Z getattr(self, test_name)() 2025-12-04T15:04:54.8261698Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8261733Z fn() 2025-12-04T15:04:54.8261882Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8261921Z method(*args, **kwargs) 2025-12-04T15:04:54.8262081Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8262121Z method(*args, **kwargs) 2025-12-04T15:04:54.8262270Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8262305Z with policy(): 2025-12-04T15:04:54.8262457Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8262496Z raise RuntimeError(msg) 2025-12-04T15:04:54.8262851Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 160256 on device 0. CUDA driver allocated memory was 2453667840 and is now 4011851776. 
2025-12-04T15:04:54.8262854Z 2025-12-04T15:04:54.8262927Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8263161Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8263163Z 2025-12-04T15:04:54.8263250Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8263252Z 2025-12-04T15:04:54.8263308Z Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.8263354Z Traceback (most recent call last): 2025-12-04T15:04:54.8263513Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8263557Z getattr(self, test_name)() 2025-12-04T15:04:54.8263713Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8263748Z fn() 2025-12-04T15:04:54.8263896Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8263936Z method(*args, **kwargs) 2025-12-04T15:04:54.8264082Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8264121Z method(*args, **kwargs) 2025-12-04T15:04:54.8264268Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8264305Z with policy(): 2025-12-04T15:04:54.8264453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8264504Z raise RuntimeError(msg) 2025-12-04T15:04:54.8264866Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 150016 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 2025-12-04T15:04:54.8264868Z 2025-12-04T15:04:54.8264944Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8265176Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8265180Z 2025-12-04T15:04:54.8265275Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8265341Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T15:04:54.8265401Z ======================= 1 failed, 26 deselected in 9.67s ======================= 2025-12-04T15:04:54.8265440Z Got exit code 1 2025-12-04T15:04:54.8265622Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_no_shard_cuda 2025-12-04T15:04:54.8265764Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.8265950Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-01f6ba787eac40cd.xml 2025-12-04T15:04:54.8266008Z ============================= test session starts ============================== 2025-12-04T15:04:54.8266119Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8266164Z cachedir: .pytest_cache 2025-12-04T15:04:54.8266320Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8266367Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8266407Z configfile: pytest.ini 2025-12-04T15:04:54.8266568Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8266641Z collecting ... collected 60 items / 15 deselected / 45 selected 2025-12-04T15:04:54.8266694Z stepcurrent: skipping 15 already run items. 2025-12-04T15:04:54.8266736Z Running 12 items in this shard 2025-12-04T15:04:54.8266738Z 2025-12-04T15:04:54.8267047Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_none_cuda I1204 14:57:52.510000 445986 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 446055 2025-12-04T15:04:54.8267202Z I1204 14:57:52.511000 445986 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 446056 2025-12-04T15:04:54.8267352Z I1204 14:57:52.512000 445986 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 446057 2025-12-04T15:04:54.8267501Z I1204 14:57:52.512000 445986 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 446058 2025-12-04T15:04:54.8268078Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8268116Z _warn_cpu_init() 2025-12-04T15:04:54.8268680Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
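The repeated _warn_cpu_init() warning above also carries its fix: give FSDP a `device_id` so sharding initialization runs on the GPU, which `sync_module_states=True` requires anyway. A minimal sketch, assuming a process group is already initialized (see the lifecycle sketch earlier) and using a stand-in `net`:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    net = torch.nn.Linear(8, 8)  # stand-in module, still on CPU at this point

    model = FSDP(
        net,
        device_id=torch.cuda.current_device(),  # run sharding init on the GPU
        sync_module_states=True,                 # needs GPU communication
    )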
2025-12-04T15:04:54.8268738Z _warn_cpu_init() 2025-12-04T15:04:54.8269312Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8269350Z _warn_cpu_init() 2025-12-04T15:04:54.8269921Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8269959Z _warn_cpu_init() 2025-12-04T15:04:54.8270285Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8270326Z return func(*args, **kwargs) 2025-12-04T15:04:54.8270470Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8270631Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8270919Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8271074Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8271362Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8271493Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8271769Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8271918Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8272193Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8272340Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8272614Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8272750Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8273040Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8273199Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8273693Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T15:04:54.8273807Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8274001Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8274370Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8274484Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8274697Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8274859Z [rank0]:E1204 14:58:00.201000 446055 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8274899Z dist init r=0, world=4 2025-12-04T15:04:54.8275035Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8275196Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8275479Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8275630Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8275914Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8276037Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8276312Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8276459Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8276732Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8276877Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8277150Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8277306Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8277581Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8277729Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8278212Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
2025-12-04T15:04:54.8278337Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8278532Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8278888Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8278999Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8279208Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8279372Z [rank2]:E1204 14:58:00.205000 446057 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8279414Z dist init r=2, world=4 2025-12-04T15:04:54.8279550Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8279708Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8279992Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8280144Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8280462Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8280584Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8280861Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8281007Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8281281Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8281450Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8281725Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8281859Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.8282155Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8282302Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8282788Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T15:04:54.8282901Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8283092Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8283446Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8283559Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8283768Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8283931Z [rank3]:E1204 14:58:00.212000 446058 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8283969Z dist init r=3, world=4 2025-12-04T15:04:54.8284105Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8284262Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8284545Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8284699Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8284983Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8285104Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8285392Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8285938Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.8286437Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8286903Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8287386Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8287847Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8288301Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8288771Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8289437Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T15:04:54.8290054Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8290431Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8291017Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8291519Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8291876Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8292283Z [rank1]:E1204 14:58:00.215000 446056 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8292528Z dist init r=1, world=4 2025-12-04T15:04:54.8292968Z [rank0]:[W1204 14:58:00.932631228 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8293374Z FAILED [9.6208s] [ 8%] 2025-12-04T15:04:54.8293437Z 2025-12-04T15:04:54.8293495Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8293700Z ___ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda ____ 2025-12-04T15:04:54.8293890Z Traceback (most recent call last): 2025-12-04T15:04:54.8294138Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8294379Z self._join_processes(fn) 2025-12-04T15:04:54.8294621Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8294899Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8295163Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8295446Z raise RuntimeError(error) 2025-12-04T15:04:54.8295634Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8295794Z Traceback (most recent call last): 2025-12-04T15:04:54.8296040Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8300251Z getattr(self, test_name)() 2025-12-04T15:04:54.8300508Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8300782Z fn() 2025-12-04T15:04:54.8301022Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8301415Z method(*args, **kwargs) 2025-12-04T15:04:54.8301673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8301903Z method(*args, **kwargs) 2025-12-04T15:04:54.8302147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8302371Z with policy(): 2025-12-04T15:04:54.8302578Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8302805Z raise RuntimeError(msg) 2025-12-04T15:04:54.8303231Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 
2025-12-04T15:04:54.8303625Z 2025-12-04T15:04:54.8303702Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8304041Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8304306Z 2025-12-04T15:04:54.8304395Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8304517Z 2025-12-04T15:04:54.8304519Z 2025-12-04T15:04:54.8304599Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8304797Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8305159Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-01f6ba787eac40cd.xml - 2025-12-04T15:04:54.8305482Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8305827Z FAILED [9.6208s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8306152Z Traceback (most recent call last): 2025-12-04T15:04:54.8306398Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8306637Z getattr(self, test_name)() 2025-12-04T15:04:54.8306865Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8307092Z fn() 2025-12-04T15:04:54.8307291Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8307514Z method(*args, **kwargs) 2025-12-04T15:04:54.8307728Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8307982Z method(*args, **kwargs) 2025-12-04T15:04:54.8308212Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8308431Z with policy(): 2025-12-04T15:04:54.8308639Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8308863Z raise RuntimeError(msg) 2025-12-04T15:04:54.8309280Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T15:04:54.8309677Z 2025-12-04T15:04:54.8309751Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8310094Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8310397Z 2025-12-04T15:04:54.8310484Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8310686Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T15:04:54.8310849Z ======================= 1 failed, 15 deselected in 9.78s ======================= 2025-12-04T15:04:54.8310984Z Got exit code 1 2025-12-04T15:04:54.8311077Z Retrying single test... 2025-12-04T15:04:54.8311328Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ea26b455ddcffa84.xml 2025-12-04T15:04:54.8311610Z ============================= test session starts ============================== 2025-12-04T15:04:54.8311818Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8312004Z cachedir: .pytest_cache 2025-12-04T15:04:54.8312223Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8312459Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8312574Z configfile: pytest.ini 2025-12-04T15:04:54.8312800Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8313071Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8313401Z stepcurrent: skipping 15 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8313696Z Running 1 items in this shard 2025-12-04T15:04:54.8313771Z 2025-12-04T15:04:54.8314078Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_none_cuda I1204 14:58:04.652000 446388 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 446457 2025-12-04T15:04:54.8314574Z I1204 14:58:04.653000 446388 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 446458 2025-12-04T15:04:54.8314917Z I1204 14:58:04.654000 446388 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 446459 2025-12-04T15:04:54.8315252Z I1204 14:58:04.654000 446388 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 446460 2025-12-04T15:04:54.8316017Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8316672Z _warn_cpu_init() 2025-12-04T15:04:54.8317315Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8317945Z _warn_cpu_init() 2025-12-04T15:04:54.8318569Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8319205Z _warn_cpu_init() 2025-12-04T15:04:54.8319829Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8320491Z _warn_cpu_init() 2025-12-04T15:04:54.8320838Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8321200Z return func(*args, **kwargs) 2025-12-04T15:04:54.8321413Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8321750Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8322233Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8322711Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8323186Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8323627Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8324062Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8324519Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8324980Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8325437Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8325895Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8326367Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8326812Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8327267Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8327941Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T15:04:54.8328564Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8328921Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8329509Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8330011Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8330408Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8330818Z [rank0]:E1204 14:58:12.355000 446457 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8331156Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8331489Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8331981Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8332462Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8332934Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8333375Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8333807Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8334262Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8334724Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8335196Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8335671Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8336117Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8336578Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8337035Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8337704Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 
2025-12-04T15:04:54.8338321Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8338666Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8339253Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8339752Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8340110Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8340543Z [rank3]:E1204 14:58:12.355000 446460 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8340778Z dist init r=0, world=4 2025-12-04T15:04:54.8340877Z dist init r=3, world=4 2025-12-04T15:04:54.8341073Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8341409Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8341887Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8342361Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8342830Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8343272Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8343705Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8344175Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8344648Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8345106Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8345560Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8346033Z [rank2]:E1204 14:58:12.409000 446459 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8346484Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8346956Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8347611Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 2025-12-04T15:04:54.8348227Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8348568Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8349155Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8349656Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8350012Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8350452Z [rank2]:E1204 14:58:12.409000 446459 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8350691Z dist init r=2, world=4 2025-12-04T15:04:54.8350890Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8351222Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8351702Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8352174Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8352646Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8353090Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8353544Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8354021Z [rank1]:E1204 14:58:12.415000 446458 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8354477Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8354951Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8355406Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8355852Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8356314Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8356777Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8357433Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T15:04:54.8358050Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8358391Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8358978Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8359483Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8359841Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8360274Z [rank1]:E1204 14:58:12.415000 446458 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8360511Z dist init r=1, world=4 2025-12-04T15:04:54.8360907Z [rank0]:[W1204 14:58:12.129203480 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8361310Z FAILED [9.8181s] [100%] 2025-12-04T15:04:54.8361375Z 2025-12-04T15:04:54.8361432Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8361625Z ___ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda ____ 2025-12-04T15:04:54.8361804Z Traceback (most recent call last): 2025-12-04T15:04:54.8362043Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8362296Z self._join_processes(fn) 2025-12-04T15:04:54.8362554Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8362814Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8363077Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8363330Z raise RuntimeError(error) 2025-12-04T15:04:54.8363487Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8363644Z Traceback (most recent call last): 2025-12-04T15:04:54.8363894Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8364131Z getattr(self, test_name)() 2025-12-04T15:04:54.8364357Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8364586Z fn() 2025-12-04T15:04:54.8364785Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8365022Z method(*args, **kwargs) 2025-12-04T15:04:54.8365239Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8365462Z method(*args, **kwargs) 2025-12-04T15:04:54.8365675Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8365895Z with policy(): 2025-12-04T15:04:54.8366104Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8366330Z raise RuntimeError(msg) 2025-12-04T15:04:54.8366749Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 
2025-12-04T15:04:54.8367135Z 2025-12-04T15:04:54.8367211Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8367550Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8367814Z 2025-12-04T15:04:54.8367902Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8368024Z 2025-12-04T15:04:54.8368025Z 2025-12-04T15:04:54.8368104Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8368303Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8368656Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ea26b455ddcffa84.xml - 2025-12-04T15:04:54.8368981Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8369325Z FAILED [9.8181s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8369648Z Traceback (most recent call last): 2025-12-04T15:04:54.8369885Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8370124Z getattr(self, test_name)() 2025-12-04T15:04:54.8370385Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8370612Z fn() 2025-12-04T15:04:54.8370823Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8371063Z method(*args, **kwargs) 2025-12-04T15:04:54.8371282Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8371505Z method(*args, **kwargs) 2025-12-04T15:04:54.8371717Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8371936Z with policy(): 2025-12-04T15:04:54.8372145Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8372370Z raise RuntimeError(msg) 2025-12-04T15:04:54.8372804Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T15:04:54.8373191Z 2025-12-04T15:04:54.8373264Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8373619Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8373884Z 2025-12-04T15:04:54.8373971Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8374154Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T15:04:54.8374315Z ======================= 1 failed, 26 deselected in 9.97s ======================= 2025-12-04T15:04:54.8374455Z Got exit code 1 2025-12-04T15:04:54.8374549Z Retrying single test... 2025-12-04T15:04:54.8374796Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-410d8ae70e809520.xml 2025-12-04T15:04:54.8375072Z ============================= test session starts ============================== 2025-12-04T15:04:54.8375281Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8375466Z cachedir: .pytest_cache 2025-12-04T15:04:54.8375684Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8375917Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8376032Z configfile: pytest.ini 2025-12-04T15:04:54.8376252Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8376523Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8376851Z stepcurrent: skipping 15 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8377146Z Running 1 items in this shard 2025-12-04T15:04:54.8377219Z 2025-12-04T15:04:54.8377527Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_none_cuda I1204 14:58:16.986000 446790 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 446859 2025-12-04T15:04:54.8378017Z I1204 14:58:16.986000 446790 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 446860 2025-12-04T15:04:54.8378354Z I1204 14:58:16.987000 446790 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 446861 2025-12-04T15:04:54.8378693Z I1204 14:58:16.988000 446790 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 446862 2025-12-04T15:04:54.8379455Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8380119Z _warn_cpu_init() 2025-12-04T15:04:54.8380786Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8381424Z _warn_cpu_init() 2025-12-04T15:04:54.8382059Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8382691Z _warn_cpu_init() 2025-12-04T15:04:54.8383308Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8383940Z _warn_cpu_init() 2025-12-04T15:04:54.8384283Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8384649Z return func(*args, **kwargs) 2025-12-04T15:04:54.8384861Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8385196Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8385684Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8386156Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8386630Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8387072Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8387511Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8387972Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8388434Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8388915Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8389370Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8389813Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8390302Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8390760Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8391434Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 55808 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T15:04:54.8393845Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8394195Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8394787Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8395292Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8395650Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8396059Z [rank1]:E1204 14:58:24.692000 446860 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8396320Z dist init r=1, world=4 2025-12-04T15:04:54.8396520Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8396854Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8397336Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8397812Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8398287Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8398727Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8399161Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8399642Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8400115Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8400618Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8401073Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8401535Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8401984Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8402447Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8403163Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 
2025-12-04T15:04:54.8403783Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8404124Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8404712Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8405211Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8405569Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8405976Z [rank0]:E1204 14:58:24.703000 446859 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8406221Z dist init r=0, world=4 2025-12-04T15:04:54.8406424Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8406769Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8407252Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8407726Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8408205Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8408653Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8409116Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8409592Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8410055Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8410579Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8411047Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8411498Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.8411945Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8412422Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8413078Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T15:04:54.8413695Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8414037Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8414623Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8415123Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8415481Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8415892Z [rank3]:E1204 14:58:24.709000 446862 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8416139Z dist init r=3, world=4 2025-12-04T15:04:54.8416339Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8416674Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8417160Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8417639Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8418112Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8418614Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8419047Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8419519Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.8420001Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8420494Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8420958Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8421406Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8421885Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8422358Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8423022Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 2025-12-04T15:04:54.8423644Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8423984Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8424571Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8425073Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8425436Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8425848Z [rank2]:E1204 14:58:24.714000 446861 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8426091Z dist init r=2, world=4 2025-12-04T15:04:54.8426495Z [rank0]:[W1204 14:58:24.447836604 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8426905Z FAILED [9.6204s] [100%] 2025-12-04T15:04:54.8426973Z 2025-12-04T15:04:54.8427032Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8427245Z ___ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda ____ 2025-12-04T15:04:54.8427446Z Traceback (most recent call last): 2025-12-04T15:04:54.8427693Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8427937Z self._join_processes(fn) 2025-12-04T15:04:54.8428186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8428454Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8428740Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8428999Z raise RuntimeError(error) 2025-12-04T15:04:54.8429147Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.8429303Z Traceback (most recent call last): 2025-12-04T15:04:54.8429539Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8429776Z getattr(self, test_name)() 2025-12-04T15:04:54.8430002Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8430256Z fn() 2025-12-04T15:04:54.8430472Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8430699Z method(*args, **kwargs) 2025-12-04T15:04:54.8430915Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8431141Z method(*args, **kwargs) 2025-12-04T15:04:54.8431353Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8431576Z with policy(): 2025-12-04T15:04:54.8431782Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8432012Z raise RuntimeError(msg) 2025-12-04T15:04:54.8432428Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 55808 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 
2025-12-04T15:04:54.8432816Z 2025-12-04T15:04:54.8432890Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8433228Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8433496Z 2025-12-04T15:04:54.8433585Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8433710Z 2025-12-04T15:04:54.8433712Z 2025-12-04T15:04:54.8433792Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8433992Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8434342Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-410d8ae70e809520.xml - 2025-12-04T15:04:54.8434665Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8435006Z FAILED [9.6204s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.8435328Z Traceback (most recent call last): 2025-12-04T15:04:54.8435569Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8435822Z getattr(self, test_name)() 2025-12-04T15:04:54.8436063Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8436290Z fn() 2025-12-04T15:04:54.8436484Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8436709Z method(*args, **kwargs) 2025-12-04T15:04:54.8436923Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8437145Z method(*args, **kwargs) 2025-12-04T15:04:54.8437374Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8437595Z with policy(): 2025-12-04T15:04:54.8437803Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8438029Z raise RuntimeError(msg) 2025-12-04T15:04:54.8438446Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 55808 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T15:04:54.8438827Z 2025-12-04T15:04:54.8438901Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8439255Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8439519Z 2025-12-04T15:04:54.8439609Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8439792Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
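[editor's note] The ProcessGroupNCCL warning earlier in this session ("destroy_process_group() was not called before program exit, which can leak resources") asks for explicit teardown of the process group. A minimal sketch of the shutdown it suggests; placing it in a finally block around the test body is an assumption, not how this harness is structured:

    import torch.distributed as dist

    try:
        pass  # test body / training loop goes here
    finally:
        if dist.is_initialized():
            # explicit teardown avoids the ProcessGroupNCCL exit warning
            dist.destroy_process_group()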
2025-12-04T15:04:54.8439955Z ======================= 1 failed, 26 deselected in 9.78s ======================= 2025-12-04T15:04:54.8440091Z Got exit code 1 2025-12-04T15:04:54.8440369Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_none_cuda 2025-12-04T15:04:54.8440702Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.8441046Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-08e84758e21829c0.xml 2025-12-04T15:04:54.8441323Z ============================= test session starts ============================== 2025-12-04T15:04:54.8441528Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8441716Z cachedir: .pytest_cache 2025-12-04T15:04:54.8441933Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8442168Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8442284Z configfile: pytest.ini 2025-12-04T15:04:54.8442504Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8442772Z collecting ... collected 60 items / 16 deselected / 44 selected 2025-12-04T15:04:54.8442930Z stepcurrent: skipping 16 already run items. 2025-12-04T15:04:54.8443056Z Running 11 items in this shard 2025-12-04T15:04:54.8443125Z 2025-12-04T15:04:54.8443484Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda I1204 14:58:29.176000 447192 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 447261 2025-12-04T15:04:54.8444021Z I1204 14:58:29.176000 447192 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 447262 2025-12-04T15:04:54.8444373Z I1204 14:58:29.177000 447192 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 447263 2025-12-04T15:04:54.8444726Z I1204 14:58:29.177000 447192 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 447264 2025-12-04T15:04:54.8445202Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8445572Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8446241Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8446885Z _warn_cpu_init() 2025-12-04T15:04:54.8447229Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
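[editor's note] The FutureWarning just below deprecates FSDP's NO_SHARD sharding strategy and points at DistributedDataParallel, which gives the same fully-replicated-parameters semantics. A minimal sketch of the suggested replacement; the toy model and device handling are assumptions:

    import torch.nn as nn
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    # assumes an initialized process group, e.g. under torchrun
    rank = dist.get_rank()
    model = nn.Linear(8, 8).cuda(rank)
    # NO_SHARD replicated parameters across ranks; DDP is the supported
    # way to keep that replication behavior
    ddp_model = DDP(model, device_ids=[rank])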
2025-12-04T15:04:54.8447629Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8448041Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8448410Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8449062Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8449694Z _warn_cpu_init() 2025-12-04T15:04:54.8450034Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8450484Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8450874Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8451238Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8451887Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8452523Z _warn_cpu_init() 2025-12-04T15:04:54.8452863Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8453257Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8453648Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8454048Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8454695Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8455329Z _warn_cpu_init() 2025-12-04T15:04:54.8455685Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8456078Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8456414Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8456716Z return func(*args, **kwargs) 2025-12-04T15:04:54.8457005Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8457317Z return func(*args, **kwargs) 2025-12-04T15:04:54.8457606Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8457902Z return func(*args, **kwargs) 2025-12-04T15:04:54.8458189Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8458484Z return func(*args, **kwargs) 2025-12-04T15:04:54.8458767Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8459060Z return func(*args, **kwargs) 2025-12-04T15:04:54.8459342Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8459636Z return func(*args, **kwargs) 2025-12-04T15:04:54.8459920Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8460246Z return func(*args, **kwargs) 2025-12-04T15:04:54.8460531Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8460827Z return func(*args, **kwargs) 2025-12-04T15:04:54.8461182Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T15:04:54.8461542Z return func(*args, **kwargs) 2025-12-04T15:04:54.8461751Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8462089Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8462569Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8463072Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8463546Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8463989Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8464437Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8464895Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8465361Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8465818Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8466291Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8466738Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8467188Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8467646Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8468357Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 
2025-12-04T15:04:54.8469027Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8469372Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8470010Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8470597Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8470956Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8471366Z [rank3]:E1204 14:58:34.971000 447264 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8471603Z dist init r=3, world=4 2025-12-04T15:04:54.8471804Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8472150Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8472647Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8473120Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8473609Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8474048Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8474483Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8474943Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8475416Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8475872Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8476329Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8476776Z [rank2]:E1204 14:58:34.976000 447263 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8477224Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8477684Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8478400Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 2. CUDA driver allocated memory was 2300575744 and is now 3466592256. 2025-12-04T15:04:54.8478516Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8478710Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8479119Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8479234Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8479444Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8479618Z [rank2]:E1204 14:58:34.976000 447263 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8479674Z dist init r=2, world=4 2025-12-04T15:04:54.8479811Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8479968Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8480300Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8480452Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8480738Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8480865Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8481166Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T15:04:54.8481313Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8481587Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8481736Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8482013Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8482148Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8482425Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8482572Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8483095Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 154112 on device 1. CUDA driver allocated memory was 2317352960 and is now 3483369472. 
2025-12-04T15:04:54.8483209Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8483405Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8483809Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8483945Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8484154Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8484319Z [rank1]:E1204 14:58:34.984000 447262 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8484358Z dist init r=1, world=4 2025-12-04T15:04:54.8484495Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8484663Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8484945Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8485099Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8485396Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8485519Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8485796Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8485944Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8486219Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8486363Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8486639Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8486773Z [rank0]:E1204 14:58:35.030000 447261 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8487050Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8487197Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8487718Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 2025-12-04T15:04:54.8487831Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8488022Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8488447Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8488559Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8488778Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8488941Z [rank0]:E1204 14:58:35.030000 447261 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8488980Z dist init r=0, world=4 2025-12-04T15:04:54.8489314Z [rank0]:[W1204 14:58:35.897331924 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8489357Z FAILED [7.6172s] [ 9%] 2025-12-04T15:04:54.8489359Z 2025-12-04T15:04:54.8489415Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8489570Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda _ 2025-12-04T15:04:54.8489617Z Traceback (most recent call last): 2025-12-04T15:04:54.8489778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8489825Z self._join_processes(fn) 2025-12-04T15:04:54.8489996Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8490051Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8490271Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8490315Z raise RuntimeError(error) 2025-12-04T15:04:54.8490394Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.8490442Z Traceback (most recent call last): 2025-12-04T15:04:54.8490601Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8490644Z getattr(self, test_name)() 2025-12-04T15:04:54.8490801Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8490837Z fn() 2025-12-04T15:04:54.8490987Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8491029Z method(*args, **kwargs) 2025-12-04T15:04:54.8491178Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8491218Z method(*args, **kwargs) 2025-12-04T15:04:54.8491365Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8491402Z with policy(): 2025-12-04T15:04:54.8491553Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8491594Z raise RuntimeError(msg) 2025-12-04T15:04:54.8491992Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 2. CUDA driver allocated memory was 2300575744 and is now 3466592256. 
2025-12-04T15:04:54.8492021Z 2025-12-04T15:04:54.8492098Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8492378Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8492382Z 2025-12-04T15:04:54.8492469Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8492472Z 2025-12-04T15:04:54.8492530Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.8492575Z Traceback (most recent call last): 2025-12-04T15:04:54.8492754Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8492795Z getattr(self, test_name)() 2025-12-04T15:04:54.8492953Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8492989Z fn() 2025-12-04T15:04:54.8493138Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8493177Z method(*args, **kwargs) 2025-12-04T15:04:54.8493327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8493382Z method(*args, **kwargs) 2025-12-04T15:04:54.8493531Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8493567Z with policy(): 2025-12-04T15:04:54.8493719Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8493759Z raise RuntimeError(msg) 2025-12-04T15:04:54.8494160Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 2025-12-04T15:04:54.8494163Z 2025-12-04T15:04:54.8494236Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8494515Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8494517Z 2025-12-04T15:04:54.8494605Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8494607Z 2025-12-04T15:04:54.8494609Z 2025-12-04T15:04:54.8494684Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8494773Z Process 2 terminated with exit code 10, terminating remaining processes. 
2025-12-04T15:04:54.8495004Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-08e84758e21829c0.xml - 2025-12-04T15:04:54.8495065Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8495360Z FAILED [7.6172s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.8495408Z Traceback (most recent call last): 2025-12-04T15:04:54.8495570Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8495612Z getattr(self, test_name)() 2025-12-04T15:04:54.8495769Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8495829Z fn() 2025-12-04T15:04:54.8495977Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8496018Z method(*args, **kwargs) 2025-12-04T15:04:54.8496167Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8496206Z method(*args, **kwargs) 2025-12-04T15:04:54.8496354Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8496392Z with policy(): 2025-12-04T15:04:54.8496558Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8496599Z raise RuntimeError(msg) 2025-12-04T15:04:54.8496999Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 2. CUDA driver allocated memory was 2300575744 and is now 3466592256. 
2025-12-04T15:04:54.8497002Z 2025-12-04T15:04:54.8497075Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8497362Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8497364Z 2025-12-04T15:04:54.8497450Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8497453Z 2025-12-04T15:04:54.8497511Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.8497556Z Traceback (most recent call last): 2025-12-04T15:04:54.8497716Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8497759Z getattr(self, test_name)() 2025-12-04T15:04:54.8497915Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8497949Z fn() 2025-12-04T15:04:54.8498097Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8498136Z method(*args, **kwargs) 2025-12-04T15:04:54.8498286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8498324Z method(*args, **kwargs) 2025-12-04T15:04:54.8498473Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8498509Z with policy(): 2025-12-04T15:04:54.8498658Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8498699Z raise RuntimeError(msg) 2025-12-04T15:04:54.8499096Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 2025-12-04T15:04:54.8499098Z 2025-12-04T15:04:54.8499172Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8499449Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8499451Z 2025-12-04T15:04:54.8499538Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8499610Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.8499687Z ======================= 1 failed, 16 deselected in 7.78s ======================= 2025-12-04T15:04:54.8499724Z Got exit code 1 2025-12-04T15:04:54.8499765Z Retrying single test... 
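Two of the warnings repeated across these sessions are actionable: the FutureWarning recommends DistributedDataParallel over FSDP's deprecated NO_SHARD strategy, and the UserWarning recommends passing device_id so FSDP's sharding initialization runs on GPU. A hedged sketch of both adjustments, assuming a process group is already initialized and using a stand-in Linear module in place of the test's nested wrapped model:

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

rank = dist.get_rank()  # assumes dist.init_process_group(...) already ran on this rank

# What the FutureWarning suggests instead of FSDP with NO_SHARD:
ddp_model = DDP(torch.nn.Linear(8, 8).cuda(rank), device_ids=[rank])

# What the UserWarning suggests when keeping FSDP: pass device_id so the
# module is moved to GPU before sharding initialization:
fsdp_model = FSDP(torch.nn.Linear(8, 8), device_id=torch.device("cuda", rank))

Neither change addresses the leak itself; they only remove the deprecation and CPU-init warnings that clutter the sessions above and below.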
2025-12-04T15:04:54.8499951Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-1c574f164cdba76c.xml 2025-12-04T15:04:54.8500009Z ============================= test session starts ============================== 2025-12-04T15:04:54.8500120Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8500162Z cachedir: .pytest_cache 2025-12-04T15:04:54.8500375Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8500422Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8500462Z configfile: pytest.ini 2025-12-04T15:04:54.8500624Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8500699Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8500973Z stepcurrent: skipping 16 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8501034Z Running 1 items in this shard 2025-12-04T15:04:54.8501036Z 2025-12-04T15:04:54.8501395Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda I1204 14:58:39.290000 447594 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 447663 2025-12-04T15:04:54.8501560Z I1204 14:58:39.290000 447594 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 447664 2025-12-04T15:04:54.8501711Z I1204 14:58:39.291000 447594 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 447665 2025-12-04T15:04:54.8501860Z I1204 14:58:39.291000 447594 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 447666 2025-12-04T15:04:54.8502154Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8502205Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8502775Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8502815Z _warn_cpu_init() 2025-12-04T15:04:54.8503102Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T15:04:54.8503150Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8503714Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8503765Z _warn_cpu_init() 2025-12-04T15:04:54.8504066Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8504143Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8504428Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8504501Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8504794Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8504844Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8505421Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8505459Z _warn_cpu_init() 2025-12-04T15:04:54.8505743Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8505817Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8506099Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8506149Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8506713Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8506751Z _warn_cpu_init() 2025-12-04T15:04:54.8507033Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8507106Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8507333Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8507375Z return func(*args, **kwargs) 2025-12-04T15:04:54.8507597Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8507638Z return func(*args, **kwargs) 2025-12-04T15:04:54.8507864Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8507905Z return func(*args, **kwargs) 2025-12-04T15:04:54.8508129Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8508197Z return func(*args, **kwargs) 2025-12-04T15:04:54.8508414Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8508453Z return func(*args, **kwargs) 2025-12-04T15:04:54.8508672Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8508711Z return func(*args, **kwargs) 2025-12-04T15:04:54.8508941Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8508982Z return func(*args, **kwargs) 2025-12-04T15:04:54.8509201Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8509243Z return func(*args, **kwargs) 2025-12-04T15:04:54.8509531Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T15:04:54.8509590Z return func(*args, **kwargs) 2025-12-04T15:04:54.8509733Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8509896Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8510216Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8510372Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8510654Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8510779Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8511054Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8511200Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8511480Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8511627Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8511904Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8512040Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8512316Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8512493Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8513019Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T15:04:54.8513147Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8513339Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8513748Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8513861Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8514085Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8514249Z [rank0]:E1204 14:58:44.996000 447663 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8514288Z dist init r=0, world=4 2025-12-04T15:04:54.8514431Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8514591Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8514877Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8515029Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8515312Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8515435Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8515712Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8515857Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8516134Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8516281Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8516556Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8516710Z [rank2]:E1204 14:58:44.997000 447665 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8516983Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8517130Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8517662Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 152064 on device 2. CUDA driver allocated memory was 2300575744 and is now 3466592256. 2025-12-04T15:04:54.8517778Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8517971Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8518388Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8518501Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8518709Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8518875Z [rank2]:E1204 14:58:44.997000 447665 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8518913Z dist init r=2, world=4 2025-12-04T15:04:54.8519051Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8519209Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8519494Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8519647Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8519931Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8520054Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8520357Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T15:04:54.8520505Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8520779Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8520951Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8521225Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8521360Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8521646Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8521793Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8522315Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 150016 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 
2025-12-04T15:04:54.8522440Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8522635Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8523041Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8523155Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8523364Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8523526Z [rank3]:E1204 14:58:45.002000 447666 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8523564Z dist init r=3, world=4 2025-12-04T15:04:54.8523701Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8523859Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8524143Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8524297Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8524578Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8524701Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8524977Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8525142Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8525416Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8525560Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8525844Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8525979Z [rank1]:E1204 14:58:45.064000 447664 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8526256Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8526403Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8526937Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 1. CUDA driver allocated memory was 2317352960 and is now 3483369472. 2025-12-04T15:04:54.8527050Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8527243Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8527646Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8527758Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8527967Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8528128Z [rank1]:E1204 14:58:45.064000 447664 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8528169Z dist init r=1, world=4 2025-12-04T15:04:54.8528500Z [rank0]:[W1204 14:58:45.680703230 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8528540Z FAILED [7.3165s] [100%] 2025-12-04T15:04:54.8528542Z 2025-12-04T15:04:54.8528599Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8528742Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda _ 2025-12-04T15:04:54.8528788Z Traceback (most recent call last): 2025-12-04T15:04:54.8528949Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8528994Z self._join_processes(fn) 2025-12-04T15:04:54.8529182Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8529246Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8529421Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8529465Z raise RuntimeError(error) 2025-12-04T15:04:54.8529542Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8529587Z Traceback (most recent call last): 2025-12-04T15:04:54.8529746Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8529799Z getattr(self, test_name)() 2025-12-04T15:04:54.8529955Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8529991Z fn() 2025-12-04T15:04:54.8530140Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8530215Z method(*args, **kwargs) 2025-12-04T15:04:54.8530365Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8530406Z method(*args, **kwargs) 2025-12-04T15:04:54.8530570Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8530608Z with policy(): 2025-12-04T15:04:54.8530758Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8530800Z raise RuntimeError(msg) 2025-12-04T15:04:54.8531200Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T15:04:54.8531204Z 2025-12-04T15:04:54.8531278Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8531557Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8531560Z 2025-12-04T15:04:54.8531647Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8531650Z 2025-12-04T15:04:54.8531708Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.8531754Z Traceback (most recent call last): 2025-12-04T15:04:54.8531916Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8531958Z getattr(self, test_name)() 2025-12-04T15:04:54.8532116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8532150Z fn() 2025-12-04T15:04:54.8532300Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8532339Z method(*args, **kwargs) 2025-12-04T15:04:54.8532488Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8532527Z method(*args, **kwargs) 2025-12-04T15:04:54.8532677Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8532713Z with policy(): 2025-12-04T15:04:54.8532864Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8532918Z raise RuntimeError(msg) 2025-12-04T15:04:54.8533328Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 150016 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 2025-12-04T15:04:54.8533331Z 2025-12-04T15:04:54.8533403Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8533680Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8533694Z 2025-12-04T15:04:54.8533782Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8533784Z 2025-12-04T15:04:54.8533786Z 2025-12-04T15:04:54.8533862Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8533950Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T15:04:54.8534180Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-1c574f164cdba76c.xml - 2025-12-04T15:04:54.8534241Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8534545Z FAILED [7.3165s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8534591Z Traceback (most recent call last): 2025-12-04T15:04:54.8534753Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8534795Z getattr(self, test_name)() 2025-12-04T15:04:54.8534953Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8534989Z fn() 2025-12-04T15:04:54.8535137Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8535177Z method(*args, **kwargs) 2025-12-04T15:04:54.8535325Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8535365Z method(*args, **kwargs) 2025-12-04T15:04:54.8535512Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8535550Z with policy(): 2025-12-04T15:04:54.8535698Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8535740Z raise RuntimeError(msg) 2025-12-04T15:04:54.8536135Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T15:04:54.8536138Z 2025-12-04T15:04:54.8536210Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8536488Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8536490Z 2025-12-04T15:04:54.8536576Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8536578Z 2025-12-04T15:04:54.8536636Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.8536691Z Traceback (most recent call last): 2025-12-04T15:04:54.8536852Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8536903Z getattr(self, test_name)() 2025-12-04T15:04:54.8537059Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8537093Z fn() 2025-12-04T15:04:54.8537242Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8537280Z method(*args, **kwargs) 2025-12-04T15:04:54.8537433Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8537482Z method(*args, **kwargs) 2025-12-04T15:04:54.8537631Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8537668Z with policy(): 2025-12-04T15:04:54.8537818Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8537859Z raise RuntimeError(msg) 2025-12-04T15:04:54.8538267Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 150016 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 2025-12-04T15:04:54.8538270Z 2025-12-04T15:04:54.8538344Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8538619Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8538622Z 2025-12-04T15:04:54.8538708Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8538771Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.8538833Z ======================= 1 failed, 26 deselected in 7.48s ======================= 2025-12-04T15:04:54.8538869Z Got exit code 1 2025-12-04T15:04:54.8538910Z Retrying single test... 
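The leak checker that produced the failure above snapshots GPU memory counters around each test body and raises when they grow. The sketch below only approximates that before/after comparison; it is not the actual CudaMemoryLeakCheck from torch.testing._internal.common_utils, and run_with_leak_check is a hypothetical helper name.

import torch

def run_with_leak_check(test_fn, device=0):
    # Snapshot both views the failure message reports: the caching allocator
    # ("Caching allocator allocated memory was ...") and the driver
    # ("CUDA driver allocated memory was ...").
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    free, total = torch.cuda.mem_get_info(device)
    driver_before = total - free

    test_fn()

    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()  # return cached blocks so the driver view is comparable
    alloc_after = torch.cuda.memory_allocated(device)
    free, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free
    if alloc_after > alloc_before or driver_after > driver_before:
        raise RuntimeError(
            f"possible leak on device {device}: allocator "
            f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
        )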
2025-12-04T15:04:54.8539098Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6fbd78b0e9190a34.xml 2025-12-04T15:04:54.8539156Z ============================= test session starts ============================== 2025-12-04T15:04:54.8539267Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8539308Z cachedir: .pytest_cache 2025-12-04T15:04:54.8539463Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8539509Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8539550Z configfile: pytest.ini 2025-12-04T15:04:54.8539710Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8539783Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8540056Z stepcurrent: skipping 16 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8540099Z Running 1 items in this shard 2025-12-04T15:04:54.8540104Z 2025-12-04T15:04:54.8540485Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda I1204 14:58:49.178000 447996 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 448065 2025-12-04T15:04:54.8540658Z I1204 14:58:49.179000 447996 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 448066 2025-12-04T15:04:54.8540826Z I1204 14:58:49.179000 447996 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 448067 2025-12-04T15:04:54.8540980Z I1204 14:58:49.180000 447996 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 448068 2025-12-04T15:04:54.8545080Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8545142Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8545764Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8545807Z _warn_cpu_init() 2025-12-04T15:04:54.8546102Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8546196Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8546485Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8546538Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8547114Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8547151Z _warn_cpu_init() 2025-12-04T15:04:54.8547437Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8547514Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8547796Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8547846Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8548411Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8548447Z _warn_cpu_init() 2025-12-04T15:04:54.8548732Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8548806Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8549114Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8549161Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8549737Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8549773Z _warn_cpu_init() 2025-12-04T15:04:54.8550057Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
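The FutureWarning repeated above recommends replacing NO_SHARD FSDP wrapping with DistributedDataParallel, since NO_SHARD keeps the parameters unsharded anyway. A minimal sketch of that swap, assuming one process per GPU; build_model is a hypothetical helper, not part of the test harness.

import torch
from torch.nn.parallel import DistributedDataParallel as DDP

def build_model(module: torch.nn.Module, rank: int) -> DDP:
    # DDP replicates the full (unsharded) parameters, matching what
    # ShardingStrategy.NO_SHARD provided under FSDP.
    module = module.to(torch.device("cuda", rank))
    return DDP(module, device_ids=[rank])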
2025-12-04T15:04:54.8550130Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8550401Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8550444Z return func(*args, **kwargs) 2025-12-04T15:04:54.8550680Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8550721Z return func(*args, **kwargs) 2025-12-04T15:04:54.8550942Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8550983Z return func(*args, **kwargs) 2025-12-04T15:04:54.8551205Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8551246Z return func(*args, **kwargs) 2025-12-04T15:04:54.8551466Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8551507Z return func(*args, **kwargs) 2025-12-04T15:04:54.8551724Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8551764Z return func(*args, **kwargs) 2025-12-04T15:04:54.8551980Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8552020Z return func(*args, **kwargs) 2025-12-04T15:04:54.8552236Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8552275Z return func(*args, **kwargs) 2025-12-04T15:04:54.8552564Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
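The two UserWarnings just above have the same remedy: pass an explicit device. A minimal sketch, assuming a torchrun-style launch that sets LOCAL_RANK and the NCCL backend used by this job; wrap_with_fsdp is a hypothetical helper, not part of the test harness.

import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_with_fsdp(module: torch.nn.Module) -> FSDP:
    device = torch.device("cuda", int(os.environ["LOCAL_RANK"]))
    # device_id here tells collectives such as barrier() which device to use,
    # silencing the c10d_logger warning above.
    dist.init_process_group("nccl", device_id=device)
    # device_id here moves the CPU module to the GPU before sharding
    # initialization, addressing the _warn_cpu_init() warning.
    return FSDP(module, device_id=device)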
2025-12-04T15:04:54.8552603Z return func(*args, **kwargs) 2025-12-04T15:04:54.8552748Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8552911Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8553214Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8553384Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8553672Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8553795Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8554084Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8554232Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8554510Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8554665Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8554940Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8555075Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8555351Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8555499Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8556028Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 
2025-12-04T15:04:54.8556144Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8556341Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8556751Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8556864Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8557076Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8557241Z [rank3]:E1204 14:58:54.943000 448068 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8557291Z dist init r=3, world=4 2025-12-04T15:04:54.8557440Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8557597Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8557884Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8558034Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8558326Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8558449Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8558726Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8558884Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8559159Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8559304Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8559578Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8559713Z [rank1]:E1204 14:58:54.996000 448066 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8559987Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8560133Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8560697Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3483369472. 2025-12-04T15:04:54.8560813Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8561006Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8561413Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8561526Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8561755Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8561935Z [rank1]:E1204 14:58:54.996000 448066 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8561972Z dist init r=1, world=4 2025-12-04T15:04:54.8562111Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8562268Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8562567Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8562721Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8563005Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8563127Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8563412Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T15:04:54.8563560Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8563833Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8563980Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8564256Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8564389Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8564664Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8564811Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8565338Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T15:04:54.8565450Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8565645Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8566050Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8566182Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8566391Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8566553Z [rank0]:E1204 14:58:55.004000 448065 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8566601Z dist init r=0, world=4 2025-12-04T15:04:54.8566742Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8566900Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8567185Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8567336Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8567630Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8567754Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8568030Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8568177Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8568452Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8568597Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8568872Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8569005Z [rank2]:E1204 14:58:55.014000 448067 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8569280Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8569426Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8569952Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 162304 on device 2. CUDA driver allocated memory was 2300575744 and is now 3466592256. 2025-12-04T15:04:54.8570076Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8570313Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8570720Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8570831Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8571054Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8571218Z [rank2]:E1204 14:58:55.014000 448067 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8571256Z dist init r=2, world=4 2025-12-04T15:04:54.8571593Z [rank0]:[W1204 14:58:55.821977225 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8571633Z FAILED [7.3196s] [100%] 2025-12-04T15:04:54.8571648Z 2025-12-04T15:04:54.8571707Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8571851Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda _ 2025-12-04T15:04:54.8571897Z Traceback (most recent call last): 2025-12-04T15:04:54.8572058Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8572103Z self._join_processes(fn) 2025-12-04T15:04:54.8572275Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8572328Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8572504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8572547Z raise RuntimeError(error) 2025-12-04T15:04:54.8572628Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.8572672Z Traceback (most recent call last): 2025-12-04T15:04:54.8572831Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8572873Z getattr(self, test_name)() 2025-12-04T15:04:54.8573028Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8573065Z fn() 2025-12-04T15:04:54.8573214Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8573254Z method(*args, **kwargs) 2025-12-04T15:04:54.8573401Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8573441Z method(*args, **kwargs) 2025-12-04T15:04:54.8573589Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8573625Z with policy(): 2025-12-04T15:04:54.8573775Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8573815Z raise RuntimeError(msg) 2025-12-04T15:04:54.8574215Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 
2025-12-04T15:04:54.8574243Z 2025-12-04T15:04:54.8574318Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8574597Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8574599Z 2025-12-04T15:04:54.8574686Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8574699Z 2025-12-04T15:04:54.8574701Z 2025-12-04T15:04:54.8574780Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8574866Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8575100Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6fbd78b0e9190a34.xml - 2025-12-04T15:04:54.8575162Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8575479Z FAILED [7.3196s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.8575525Z Traceback (most recent call last): 2025-12-04T15:04:54.8575687Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8575730Z getattr(self, test_name)() 2025-12-04T15:04:54.8575887Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8575924Z fn() 2025-12-04T15:04:54.8576074Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8576115Z method(*args, **kwargs) 2025-12-04T15:04:54.8576263Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8576302Z method(*args, **kwargs) 2025-12-04T15:04:54.8576449Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8576486Z with policy(): 2025-12-04T15:04:54.8576635Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8576676Z raise RuntimeError(msg) 2025-12-04T15:04:54.8577075Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 2025-12-04T15:04:54.8577078Z 2025-12-04T15:04:54.8577152Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8577430Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8577432Z 2025-12-04T15:04:54.8577519Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8577582Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.8577645Z ======================= 1 failed, 26 deselected in 7.46s ======================= 2025-12-04T15:04:54.8577681Z Got exit code 1 2025-12-04T15:04:54.8577919Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T15:04:54.8578057Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.8578243Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-de1fe8b11ca18c31.xml 2025-12-04T15:04:54.8578302Z ============================= test session starts ============================== 2025-12-04T15:04:54.8578413Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8578455Z cachedir: .pytest_cache 2025-12-04T15:04:54.8578622Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8578668Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8578709Z configfile: pytest.ini 2025-12-04T15:04:54.8578871Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8578945Z collecting ... collected 60 items / 17 deselected / 43 selected 2025-12-04T15:04:54.8578996Z stepcurrent: skipping 17 already run items. 2025-12-04T15:04:54.8579039Z Running 10 items in this shard 2025-12-04T15:04:54.8579041Z 2025-12-04T15:04:54.8579412Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda I1204 14:58:59.170000 448398 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 448467 2025-12-04T15:04:54.8579566Z I1204 14:58:59.171000 448398 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 448468 2025-12-04T15:04:54.8579717Z I1204 14:58:59.172000 448398 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 448469 2025-12-04T15:04:54.8579866Z I1204 14:58:59.172000 448398 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 448470 2025-12-04T15:04:54.8580482Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8580522Z _warn_cpu_init() 2025-12-04T15:04:54.8581088Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8581126Z _warn_cpu_init() 2025-12-04T15:04:54.8581690Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8581727Z _warn_cpu_init() 2025-12-04T15:04:54.8582288Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8582350Z _warn_cpu_init() 2025-12-04T15:04:54.8582641Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8582682Z return func(*args, **kwargs) 2025-12-04T15:04:54.8582836Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8582996Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8583286Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8583440Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8583741Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8583866Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8584142Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8584289Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8584563Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8584709Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8584983Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8585119Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8585394Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8585542Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8586067Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 59904 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 2025-12-04T15:04:54.8586181Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8586375Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8586799Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8586913Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8587134Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8587297Z [rank1]:E1204 14:59:04.992000 448468 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8587337Z dist init r=1, world=4 2025-12-04T15:04:54.8587476Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8587636Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8587929Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8588081Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8588366Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8588491Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8588768Z [rank2]:E1204 14:59:05.004000 448469 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8588914Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8589190Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8589335Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8589608Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8589742Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8590018Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8590163Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8590720Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 
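The RuntimeError above comes from PyTorch's CUDA memory-leak checker, which snapshots both caching-allocator and driver-level memory around the test body and reports the before/after numbers seen in the message. A simplified sketch of that comparison (schematic only, not the actual checker in `common_utils.py`):

```python
import torch

def run_with_leak_check(test_fn, device: int = 0) -> None:
    # Snapshot caching-allocator bytes and driver-level usage before the test.
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    free, total = torch.cuda.mem_get_info(device)
    driver_before = total - free

    test_fn()

    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free, total = torch.cuda.mem_get_info(device)
    driver_after = total - free

    # Only flag a leak when the driver also reports growth ("CUDA driver API
    # confirmed a leak"), since the caching allocator can legitimately hold
    # freed blocks in its cache.
    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
        )
```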
2025-12-04T15:04:54.8590860Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8591053Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8591478Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8591590Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8591800Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8591962Z [rank2]:E1204 14:59:05.004000 448469 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8592000Z dist init r=2, world=4 2025-12-04T15:04:54.8592134Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8592304Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8592589Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8592743Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8593027Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8593148Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8593422Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8593567Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8593847Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8593993Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8594267Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8594401Z [rank3]:E1204 14:59:05.032000 448470 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8594676Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8594835Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8595361Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T15:04:54.8595474Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8595676Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8596075Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8596188Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8596395Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8596567Z [rank3]:E1204 14:59:05.032000 448470 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8596604Z dist init r=3, world=4 2025-12-04T15:04:54.8596741Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8596897Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8597184Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8597334Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8597617Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8597738Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8598014Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T15:04:54.8598160Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8598432Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8598578Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8598852Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8598986Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8599281Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8599429Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8599963Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 2025-12-04T15:04:54.8600075Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8600300Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8600715Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8600827Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8601037Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8601200Z [rank0]:E1204 14:59:05.064000 448467 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8601239Z dist init r=0, world=4 2025-12-04T15:04:54.8601578Z [rank0]:[W1204 14:59:05.845704688 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8601616Z FAILED [7.5169s] [ 10%] 2025-12-04T15:04:54.8601618Z 2025-12-04T15:04:54.8601673Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8601811Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda _ 2025-12-04T15:04:54.8601857Z Traceback (most recent call last): 2025-12-04T15:04:54.8602018Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8602061Z self._join_processes(fn) 2025-12-04T15:04:54.8602233Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8602286Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8602462Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8602504Z raise RuntimeError(error) 2025-12-04T15:04:54.8602584Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.8602627Z Traceback (most recent call last): 2025-12-04T15:04:54.8602788Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8602829Z getattr(self, test_name)() 2025-12-04T15:04:54.8602986Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8603032Z fn() 2025-12-04T15:04:54.8603194Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8603234Z method(*args, **kwargs) 2025-12-04T15:04:54.8603382Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8603420Z method(*args, **kwargs) 2025-12-04T15:04:54.8603569Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8603604Z with policy(): 2025-12-04T15:04:54.8603768Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8603809Z raise RuntimeError(msg) 2025-12-04T15:04:54.8604201Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 59904 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 
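The parent-side traceback above (`_join_processes` -> `_check_return_codes`) shows the pattern behind "Process 1 exited with error code 10": each rank runs the test in its own process, and a nonzero child exit code is re-raised in the parent. A schematic standard-library version of that pattern (not the real `common_distributed.py` code; `target` is assumed to take `(rank, world_size)`):

```python
import multiprocessing as mp

def run_in_processes(target, world_size: int) -> None:
    # CUDA does not survive fork(), so use the "spawn" start method.
    ctx = mp.get_context("spawn")
    procs = [
        ctx.Process(target=target, args=(rank, world_size))
        for rank in range(world_size)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Surface child failures in the parent, mirroring the
    # "Process N exited with error code ..." message above.
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")
```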
2025-12-04T15:04:54.8604205Z 2025-12-04T15:04:54.8604282Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8604567Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8604570Z 2025-12-04T15:04:54.8604657Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8604659Z 2025-12-04T15:04:54.8604717Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.8604761Z Traceback (most recent call last): 2025-12-04T15:04:54.8604924Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8604965Z getattr(self, test_name)() 2025-12-04T15:04:54.8605123Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8605157Z fn() 2025-12-04T15:04:54.8605307Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8605345Z method(*args, **kwargs) 2025-12-04T15:04:54.8605494Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8605532Z method(*args, **kwargs) 2025-12-04T15:04:54.8605681Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8605716Z with policy(): 2025-12-04T15:04:54.8605867Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8605907Z raise RuntimeError(msg) 2025-12-04T15:04:54.8606302Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 2025-12-04T15:04:54.8606305Z 2025-12-04T15:04:54.8606376Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8606782Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8606784Z 2025-12-04T15:04:54.8606871Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8606873Z 2025-12-04T15:04:54.8606890Z 2025-12-04T15:04:54.8606966Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8607073Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T15:04:54.8607304Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-de1fe8b11ca18c31.xml - 2025-12-04T15:04:54.8607365Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8607657Z FAILED [7.5169s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.8607714Z Traceback (most recent call last): 2025-12-04T15:04:54.8607876Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8607919Z getattr(self, test_name)() 2025-12-04T15:04:54.8608076Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8608112Z fn() 2025-12-04T15:04:54.8608261Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8608301Z method(*args, **kwargs) 2025-12-04T15:04:54.8608459Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8608499Z method(*args, **kwargs) 2025-12-04T15:04:54.8608646Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8608685Z with policy(): 2025-12-04T15:04:54.8608833Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8608875Z raise RuntimeError(msg) 2025-12-04T15:04:54.8609268Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 59904 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 
2025-12-04T15:04:54.8609271Z 2025-12-04T15:04:54.8609343Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8609617Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8609619Z 2025-12-04T15:04:54.8609705Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8609708Z 2025-12-04T15:04:54.8609766Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.8609810Z Traceback (most recent call last): 2025-12-04T15:04:54.8609970Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8610011Z getattr(self, test_name)() 2025-12-04T15:04:54.8610197Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8610230Z fn() 2025-12-04T15:04:54.8610380Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8610418Z method(*args, **kwargs) 2025-12-04T15:04:54.8610566Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8610606Z method(*args, **kwargs) 2025-12-04T15:04:54.8610754Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8610805Z with policy(): 2025-12-04T15:04:54.8610957Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8611012Z raise RuntimeError(msg) 2025-12-04T15:04:54.8611403Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 2025-12-04T15:04:54.8611405Z 2025-12-04T15:04:54.8611477Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8611764Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8611766Z 2025-12-04T15:04:54.8611853Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8611917Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.8611979Z ======================= 1 failed, 17 deselected in 7.67s ======================= 2025-12-04T15:04:54.8612014Z Got exit code 1 2025-12-04T15:04:54.8612054Z Retrying single test... 
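The repro command printed above runs the failing test from the repo root with the leak checker enabled. A small Python wrapper that sets the same two environment variables (an illustrative convenience, not part of the harness):

```python
import os
import subprocess

# Same command the log prints under "To execute this test", with the two
# environment variables it asks for set on top of the current environment.
env = dict(
    os.environ,
    PYTORCH_TEST_WITH_ROCM="1",
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
)
subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_fsdp_core.py",
        "TestParityWithDDPCUDA."
        "test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda",
    ],
    env=env,
    check=True,
)
```

As the log itself notes, setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 suppresses the repro message.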
2025-12-04T15:04:54.8612240Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-04de444d9328f7d3.xml 2025-12-04T15:04:54.8612311Z ============================= test session starts ============================== 2025-12-04T15:04:54.8612422Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8612464Z cachedir: .pytest_cache 2025-12-04T15:04:54.8612620Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8612666Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8612706Z configfile: pytest.ini 2025-12-04T15:04:54.8612866Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8612939Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8613206Z stepcurrent: skipping 17 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8613249Z Running 1 items in this shard 2025-12-04T15:04:54.8613251Z 2025-12-04T15:04:54.8613598Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda I1204 14:59:09.114000 448800 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 448869 2025-12-04T15:04:54.8613753Z I1204 14:59:09.115000 448800 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 448870 2025-12-04T15:04:54.8613905Z I1204 14:59:09.115000 448800 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 448871 2025-12-04T15:04:54.8614055Z I1204 14:59:09.116000 448800 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 448872 2025-12-04T15:04:54.8614636Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8614673Z _warn_cpu_init() 2025-12-04T15:04:54.8615244Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8615295Z _warn_cpu_init() 2025-12-04T15:04:54.8615867Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8615905Z _warn_cpu_init() 2025-12-04T15:04:54.8616471Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8616517Z _warn_cpu_init() 2025-12-04T15:04:54.8616809Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8616851Z return func(*args, **kwargs) 2025-12-04T15:04:54.8616992Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8617153Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8617440Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8617594Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8617877Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8618001Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8618277Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8618424Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8618701Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8618848Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8619123Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8619278Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8619552Z [rank3]:E1204 14:59:14.916000 
448872 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8619698Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8620256Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T15:04:54.8620373Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8620567Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8620985Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8621099Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8621309Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8621473Z [rank3]:E1204 14:59:14.916000 448872 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8621511Z dist init r=3, world=4 2025-12-04T15:04:54.8621646Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8621805Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8622090Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8622242Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8622526Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8622648Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8622926Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8623072Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.8623345Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8623525Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8623798Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8623934Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8624216Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8624363Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8624878Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 2025-12-04T15:04:54.8625001Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8625195Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8625597Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8625712Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8625920Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8626083Z [rank1]:E1204 14:59:14.917000 448870 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8626121Z dist init r=1, world=4 2025-12-04T15:04:54.8626262Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8626419Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8626705Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8626857Z [rank2]:E1204 14:59:14.983000 448871 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8627141Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8627263Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8627538Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8627696Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8627981Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8628127Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8628412Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8628547Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8628821Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8628968Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8629498Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 
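The earlier `c10d_logger.py` UserWarning about `barrier()` suggests binding the process group to a device at init time. A sketch of that, assuming a recent PyTorch where `init_process_group` accepts `device_id` (illustrative; not this test's actual setup code):

```python
import torch
import torch.distributed as dist

def init_distributed(rank: int, world_size: int) -> None:
    # Assumes MASTER_ADDR / MASTER_PORT are already set in the environment.
    torch.cuda.set_device(rank)
    # Binding the group to a device up front silences the barrier() warning
    # about inferring the device from the current context.
    dist.init_process_group(
        "nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),
    )
```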
2025-12-04T15:04:54.8629613Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8629809Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8630238Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8630352Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8630561Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8630723Z [rank2]:E1204 14:59:14.983000 448871 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8630761Z dist init r=2, world=4 2025-12-04T15:04:54.8630896Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8631055Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8631341Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8631493Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8631777Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8631913Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8632201Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8632348Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8632624Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8632782Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8633055Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8633191Z [rank0]:E1204 14:59:14.994000 448869 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8633478Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8633623Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8634139Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 59904 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 2025-12-04T15:04:54.8634251Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8634443Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8634845Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8634955Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8635164Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8635325Z [rank0]:E1204 14:59:14.994000 448869 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8635363Z dist init r=0, world=4 2025-12-04T15:04:54.8635695Z [rank0]:[W1204 14:59:15.874055444 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8635733Z FAILED [7.4174s] [100%] 2025-12-04T15:04:54.8635736Z 2025-12-04T15:04:54.8635791Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8635929Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda _ 2025-12-04T15:04:54.8635985Z Traceback (most recent call last): 2025-12-04T15:04:54.8636161Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8636204Z self._join_processes(fn) 2025-12-04T15:04:54.8636375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8636427Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8636605Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8636648Z raise RuntimeError(error) 2025-12-04T15:04:54.8636736Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.8636781Z Traceback (most recent call last): 2025-12-04T15:04:54.8636940Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8636983Z getattr(self, test_name)() 2025-12-04T15:04:54.8637139Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8637174Z fn() 2025-12-04T15:04:54.8637323Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8637363Z method(*args, **kwargs) 2025-12-04T15:04:54.8637526Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8637566Z method(*args, **kwargs) 2025-12-04T15:04:54.8637714Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8637750Z with policy(): 2025-12-04T15:04:54.8637899Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8637941Z raise RuntimeError(msg) 2025-12-04T15:04:54.8638332Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 
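The ProcessGroupNCCL warning above asks for an explicit `destroy_process_group()` before program exit. A minimal teardown sketch (illustrative only; the test harness has its own cleanup path):

```python
import torch.distributed as dist

def main(rank: int, world_size: int) -> None:
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        ...  # test or training body
    finally:
        # Explicit teardown releases communicator resources and avoids the
        # "destroy_process_group() was not called" warning at exit.
        dist.destroy_process_group()
```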
2025-12-04T15:04:54.8638334Z 2025-12-04T15:04:54.8638407Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8638680Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8638682Z 2025-12-04T15:04:54.8638768Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8638771Z 2025-12-04T15:04:54.8638773Z 2025-12-04T15:04:54.8638848Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8638935Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8639166Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-04de444d9328f7d3.xml - 2025-12-04T15:04:54.8639226Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8639517Z FAILED [7.4174s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.8639563Z Traceback (most recent call last): 2025-12-04T15:04:54.8639727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8639780Z getattr(self, test_name)() 2025-12-04T15:04:54.8639949Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8639982Z fn() 2025-12-04T15:04:54.8640131Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8640202Z method(*args, **kwargs) 2025-12-04T15:04:54.8640353Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8640391Z method(*args, **kwargs) 2025-12-04T15:04:54.8640551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8640587Z with policy(): 2025-12-04T15:04:54.8640737Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8640776Z raise RuntimeError(msg) 2025-12-04T15:04:54.8641170Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T15:04:54.8641172Z 2025-12-04T15:04:54.8641245Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8641533Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8641535Z 2025-12-04T15:04:54.8641623Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8641686Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.8641749Z ======================= 1 failed, 26 deselected in 7.57s ======================= 2025-12-04T15:04:54.8641785Z Got exit code 1 2025-12-04T15:04:54.8641824Z Retrying single test... 2025-12-04T15:04:54.8642010Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-75f7e19fe7d04b8e.xml 2025-12-04T15:04:54.8642066Z ============================= test session starts ============================== 2025-12-04T15:04:54.8642177Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8642218Z cachedir: .pytest_cache 2025-12-04T15:04:54.8642373Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8642418Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8642456Z configfile: pytest.ini 2025-12-04T15:04:54.8642617Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8642690Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8642957Z stepcurrent: skipping 17 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8642999Z Running 1 items in this shard 2025-12-04T15:04:54.8643002Z 2025-12-04T15:04:54.8643347Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda I1204 14:59:19.137000 449202 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 449271 2025-12-04T15:04:54.8643500Z I1204 14:59:19.137000 449202 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 449272 2025-12-04T15:04:54.8643649Z I1204 14:59:19.138000 449202 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 449273 2025-12-04T15:04:54.8643824Z I1204 14:59:19.138000 449202 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 449274 2025-12-04T15:04:54.8644399Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8644436Z _warn_cpu_init() 2025-12-04T15:04:54.8645009Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8645048Z _warn_cpu_init() 2025-12-04T15:04:54.8645625Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8645661Z _warn_cpu_init() 2025-12-04T15:04:54.8646224Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8646261Z _warn_cpu_init() 2025-12-04T15:04:54.8646550Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8646592Z return func(*args, **kwargs) 2025-12-04T15:04:54.8646733Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8646892Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8647178Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8647333Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8647615Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8647739Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8648013Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8648179Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8648453Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8648599Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8648885Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8649019Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8649295Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8649441Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8649970Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 59904 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T15:04:54.8650083Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8650309Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8650714Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8650827Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8651038Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8651200Z [rank3]:E1204 14:59:24.871000 449274 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8651238Z dist init r=3, world=4 2025-12-04T15:04:54.8651375Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8651533Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8651819Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8651972Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8652255Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8652403Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8652678Z [rank2]:E1204 14:59:24.925000 449273 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8652823Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8653115Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8653260Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8653537Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8653671Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8653960Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8654108Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8654623Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 
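The failures above all come from the mem-leak check enabled by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: it snapshots the caching allocator's per-device counter before the test body and compares afterwards, only reporting a leak when the driver-level number grew as well ("CUDA driver API confirmed a leak"). The real check lives in torch/testing/_internal/common_utils.py and is more involved; the following is a minimal sketch of the same idea using only public torch.cuda APIs (the function name check_cuda_leak and the single-device scope are illustrative assumptions):

import torch

def check_cuda_leak(fn, device: int = 0):
    # Sketch of the check that produced the failures above: snapshot the
    # caching-allocator counter and the driver-level usage, run the test
    # body, then compare both numbers (illustrative, not the CI code).
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    free, total = torch.cuda.mem_get_info(device)
    driver_before = total - free

    fn()

    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free, total = torch.cuda.mem_get_info(device)
    driver_after = total - free

    # Only flag a leak when the driver agrees with the allocator, mirroring
    # the "CUDA driver API confirmed a leak" wording in this log.
    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible CUDA leak: caching allocator was {alloc_before} and is "
            f"now {alloc_after}; driver was {driver_before} and is now {driver_after}"
        )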
2025-12-04T15:04:54.8654737Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8654931Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8655330Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8655440Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8655650Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8655815Z [rank2]:E1204 14:59:24.925000 449273 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8655852Z dist init r=2, world=4 2025-12-04T15:04:54.8655992Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8656148Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8656432Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8656599Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8656897Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8657019Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8657307Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8657452Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8657726Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8657872Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8658154Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8658288Z [rank0]:E1204 14:59:24.932000 449271 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8658562Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8658709Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8659227Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 59904 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 2025-12-04T15:04:54.8659339Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8659533Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8659932Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8660045Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8660320Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8660484Z [rank0]:E1204 14:59:24.932000 449271 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8660619Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8660777Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8661087Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8661238Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8661529Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8661673Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8661948Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8662095Z [rank1]:E1204 
14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8662369Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8662532Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8662809Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8662943Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8663222Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8663370Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8663894Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 55808 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 2025-12-04T15:04:54.8664006Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8664199Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8664599Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8664711Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8664920Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8665081Z [rank1]:E1204 14:59:24.932000 449272 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8665142Z dist init r=0, world=4 2025-12-04T15:04:54.8665180Z dist init r=1, world=4 2025-12-04T15:04:54.8665515Z [rank0]:[W1204 14:59:25.710935643 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8665553Z FAILED [7.3183s] [100%] 2025-12-04T15:04:54.8665556Z 2025-12-04T15:04:54.8665611Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8665766Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda _ 2025-12-04T15:04:54.8665812Z Traceback (most recent call last): 2025-12-04T15:04:54.8665973Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8666016Z self._join_processes(fn) 2025-12-04T15:04:54.8666188Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8666241Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8666417Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8666458Z raise RuntimeError(error) 2025-12-04T15:04:54.8666549Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.8666593Z Traceback (most recent call last): 2025-12-04T15:04:54.8666752Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8666793Z getattr(self, test_name)() 2025-12-04T15:04:54.8666949Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8666985Z fn() 2025-12-04T15:04:54.8667133Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8667173Z method(*args, **kwargs) 2025-12-04T15:04:54.8667320Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8667359Z method(*args, **kwargs) 2025-12-04T15:04:54.8667507Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8667543Z with policy(): 2025-12-04T15:04:54.8667692Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8667733Z raise RuntimeError(msg) 2025-12-04T15:04:54.8668125Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 59904 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 
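The ProcessGroupNCCL warning repeated throughout this log ("destroy_process_group() was not called before program exit") points at a missing teardown call. A minimal sketch of the init/teardown pairing the warning asks for; the rendezvous environment variables, function name, and world size here are illustrative, and real jobs get them from the launcher:

import os
import torch.distributed as dist

def main(rank: int, world_size: int):
    # Illustrative rendezvous settings; torchrun/launchers normally set these.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        dist.barrier()
        # ... test or training body ...
    finally:
        # The cleanup whose absence triggers the warning above.
        dist.destroy_process_group()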
2025-12-04T15:04:54.8668128Z 2025-12-04T15:04:54.8668205Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8668478Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8668480Z 2025-12-04T15:04:54.8668567Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8668569Z 2025-12-04T15:04:54.8668571Z 2025-12-04T15:04:54.8668646Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8668733Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8668975Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-75f7e19fe7d04b8e.xml - 2025-12-04T15:04:54.8669055Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8669344Z FAILED [7.3183s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.8669388Z Traceback (most recent call last): 2025-12-04T15:04:54.8669550Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8669601Z getattr(self, test_name)() 2025-12-04T15:04:54.8669759Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8669793Z fn() 2025-12-04T15:04:54.8669941Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8669982Z method(*args, **kwargs) 2025-12-04T15:04:54.8670130Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8670295Z method(*args, **kwargs) 2025-12-04T15:04:54.8670462Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8670498Z with policy(): 2025-12-04T15:04:54.8670650Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8670689Z raise RuntimeError(msg) 2025-12-04T15:04:54.8671106Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 59904 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T15:04:54.8671111Z 2025-12-04T15:04:54.8671184Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8671459Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8671461Z 2025-12-04T15:04:54.8671548Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8671613Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.8671674Z ======================= 1 failed, 26 deselected in 7.48s ======================= 2025-12-04T15:04:54.8671709Z Got exit code 1 2025-12-04T15:04:54.8671931Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T15:04:54.8672059Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.8672245Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-217448bec54bd68c.xml 2025-12-04T15:04:54.8672301Z ============================= test session starts ============================== 2025-12-04T15:04:54.8672413Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8672453Z cachedir: .pytest_cache 2025-12-04T15:04:54.8672612Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8672656Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8672695Z configfile: pytest.ini 2025-12-04T15:04:54.8672874Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8672962Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T15:04:54.8673014Z stepcurrent: skipping 18 already run items. 2025-12-04T15:04:54.8673056Z Running 9 items in this shard 2025-12-04T15:04:54.8673058Z 2025-12-04T15:04:54.8673410Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda I1204 14:59:28.988000 449604 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 449673 2025-12-04T15:04:54.8673583Z I1204 14:59:28.989000 449604 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 449674 2025-12-04T15:04:54.8673734Z I1204 14:59:28.989000 449604 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 449675 2025-12-04T15:04:54.8673882Z I1204 14:59:28.990000 449604 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 449676 2025-12-04T15:04:54.8674173Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8674223Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8674808Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8674846Z _warn_cpu_init() 2025-12-04T15:04:54.8675133Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8675182Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8675746Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8675782Z _warn_cpu_init() 2025-12-04T15:04:54.8676062Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8676113Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8676673Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8676709Z _warn_cpu_init() 2025-12-04T15:04:54.8676994Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8677083Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8677472Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8677547Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8677830Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8677901Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8678198Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8678247Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8678821Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8678857Z _warn_cpu_init() 2025-12-04T15:04:54.8679143Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8679215Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8679503Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8679547Z return func(*args, **kwargs) 2025-12-04T15:04:54.8679774Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8679816Z return func(*args, **kwargs) 2025-12-04T15:04:54.8680039Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8680080Z return func(*args, **kwargs) 2025-12-04T15:04:54.8680341Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8680382Z return func(*args, **kwargs) 2025-12-04T15:04:54.8680602Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8680642Z return func(*args, **kwargs) 2025-12-04T15:04:54.8680860Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8680901Z return func(*args, **kwargs) 2025-12-04T15:04:54.8681118Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8681158Z return func(*args, **kwargs) 2025-12-04T15:04:54.8681375Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8681450Z return func(*args, **kwargs) 2025-12-04T15:04:54.8681665Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T15:04:54.8681705Z return func(*args, **kwargs) 2025-12-04T15:04:54.8681848Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8682008Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8682314Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8682467Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8682753Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8682876Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8683171Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8683319Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8683595Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8683744Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8684017Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8684153Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8684429Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8684575Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8685100Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 73216 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 
2025-12-04T15:04:54.8685216Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8685413Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8685818Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8685962Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8686171Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8686335Z [rank0]:E1204 14:59:34.836000 449673 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8686487Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8686645Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8686930Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8687084Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8687380Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8687504Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8687780Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8687933Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8688212Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8688358Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8688633Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8688766Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8689042Z 
[rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8689190Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8689708Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 58880 on device 2. CUDA driver allocated memory was 2300575744 and is now 3418357760. 2025-12-04T15:04:54.8689824Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8690033Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8690488Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8690600Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8690821Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8690984Z [rank2]:E1204 14:59:34.837000 449675 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8691024Z dist init r=0, world=4 2025-12-04T15:04:54.8691063Z dist init r=2, world=4 2025-12-04T15:04:54.8691198Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8691355Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8691651Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8691806Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8692087Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8692212Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8692486Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8692633Z [rank1]:E1204 
14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8692907Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8693051Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8693325Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8693458Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8693733Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8693878Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8694396Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 71168 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 
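The FutureWarning repeated above deprecates the `NO_SHARD` sharding strategy and suggests `DistributedDataParallel`. A minimal sketch of that replacement, since NO_SHARD keeps full parameters on every rank just as DDP does; the function name and the assumption of one GPU per rank are illustrative, and the process group must be initialized first:

import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_no_shard_equivalent(model: nn.Module, rank: int) -> DDP:
    # NO_SHARD replicates full parameters per rank, which is DDP's model;
    # the deprecation message recommends using DDP directly.
    model = model.to(torch.device("cuda", rank))
    return DDP(model, device_ids=[rank])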
2025-12-04T15:04:54.8694541Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8694734Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8695148Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8695260Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8695470Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8695630Z [rank1]:E1204 14:59:34.882000 449674 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8695668Z dist init r=1, world=4 2025-12-04T15:04:54.8695821Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8695981Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8696264Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8696417Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8696699Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8696821Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8697097Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8697242Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8697516Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8697663Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8697937Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8698073Z [rank3]:E1204 14:59:34.887000 449676 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8698347Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8698520Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8699040Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 67072 on device 3. CUDA driver allocated memory was 2250244096 and is now 3368026112. 2025-12-04T15:04:54.8699168Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8699361Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8699763Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8699876Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8700101Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8700296Z [rank3]:E1204 14:59:34.887000 449676 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8700333Z dist init r=3, world=4 2025-12-04T15:04:54.8700667Z [rank0]:[W1204 14:59:35.529016664 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8700706Z FAILED [7.5174s] [ 11%] 2025-12-04T15:04:54.8700708Z 2025-12-04T15:04:54.8700764Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8700903Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda _ 2025-12-04T15:04:54.8700950Z Traceback (most recent call last): 2025-12-04T15:04:54.8701110Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8701154Z self._join_processes(fn) 2025-12-04T15:04:54.8701324Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8701378Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8701555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8701600Z raise RuntimeError(error) 2025-12-04T15:04:54.8701679Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8701723Z Traceback (most recent call last): 2025-12-04T15:04:54.8701882Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8701924Z getattr(self, test_name)() 2025-12-04T15:04:54.8702080Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8702114Z fn() 2025-12-04T15:04:54.8702264Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8702324Z method(*args, **kwargs) 2025-12-04T15:04:54.8702472Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8702528Z method(*args, **kwargs) 2025-12-04T15:04:54.8702676Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8702711Z with policy(): 2025-12-04T15:04:54.8702861Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8702900Z raise RuntimeError(msg) 2025-12-04T15:04:54.8703314Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 73216 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 
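The c10d_logger warning earlier in this run ("barrier(): using the device under current context") states that specifying `device_id` in `init_process_group` mutes it. A sketch of that, assuming one GPU per rank and a recent PyTorch release where init_process_group accepts a device_id keyword; the function name is illustrative:

import torch
import torch.distributed as dist

def init_with_device(rank: int, world_size: int):
    # Binding the group to this rank's GPU lets collectives like barrier()
    # pick the device explicitly instead of guessing from current context.
    dist.init_process_group(
        "nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),
    )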
2025-12-04T15:04:54.8703317Z 2025-12-04T15:04:54.8703392Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8703670Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8703672Z 2025-12-04T15:04:54.8703757Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8703760Z 2025-12-04T15:04:54.8703762Z 2025-12-04T15:04:54.8703854Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8703941Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8704171Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-217448bec54bd68c.xml - 2025-12-04T15:04:54.8704232Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8704523Z FAILED [7.5174s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8704570Z Traceback (most recent call last): 2025-12-04T15:04:54.8704730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8704771Z getattr(self, test_name)() 2025-12-04T15:04:54.8704928Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8704962Z fn() 2025-12-04T15:04:54.8705112Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8705151Z method(*args, **kwargs) 2025-12-04T15:04:54.8705300Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8705341Z method(*args, **kwargs) 2025-12-04T15:04:54.8705487Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8705523Z with policy(): 2025-12-04T15:04:54.8705671Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8705712Z raise RuntimeError(msg) 2025-12-04T15:04:54.8706106Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 73216 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 2025-12-04T15:04:54.8706110Z 2025-12-04T15:04:54.8706181Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8706475Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8706487Z 2025-12-04T15:04:54.8706573Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8706636Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.8706698Z ======================= 1 failed, 18 deselected in 7.67s ======================= 2025-12-04T15:04:54.8706734Z Got exit code 1 2025-12-04T15:04:54.8706773Z Retrying single test... 2025-12-04T15:04:54.8706972Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c42e4a4d65ee4c8f.xml 2025-12-04T15:04:54.8707029Z ============================= test session starts ============================== 2025-12-04T15:04:54.8707142Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8707182Z cachedir: .pytest_cache 2025-12-04T15:04:54.8707338Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8707382Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8707421Z configfile: pytest.ini 2025-12-04T15:04:54.8707594Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8707668Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8707938Z stepcurrent: skipping 18 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8707982Z Running 1 items in this shard 2025-12-04T15:04:54.8707985Z 2025-12-04T15:04:54.8708334Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda I1204 14:59:38.934000 450006 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 450075 2025-12-04T15:04:54.8708488Z I1204 14:59:38.934000 450006 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 450076 2025-12-04T15:04:54.8708639Z I1204 14:59:38.935000 450006 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 450077 2025-12-04T15:04:54.8708787Z I1204 14:59:38.935000 450006 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 450078 2025-12-04T15:04:54.8709079Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8709130Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8709701Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8709737Z _warn_cpu_init() 2025-12-04T15:04:54.8710022Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T15:04:54.8710070Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8710686Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8710755Z _warn_cpu_init() 2025-12-04T15:04:54.8711041Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8711134Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8711418Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8711495Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8711775Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8711824Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8712415Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8712452Z _warn_cpu_init() 2025-12-04T15:04:54.8712734Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8712807Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8713097Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8713140Z return func(*args, **kwargs) 2025-12-04T15:04:54.8713422Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8713468Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8714035Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8714072Z _warn_cpu_init() 2025-12-04T15:04:54.8714354Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8714427Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8714654Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8714720Z return func(*args, **kwargs) 2025-12-04T15:04:54.8714941Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8714982Z return func(*args, **kwargs) 2025-12-04T15:04:54.8715202Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8715243Z return func(*args, **kwargs) 2025-12-04T15:04:54.8715489Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8715530Z return func(*args, **kwargs) 2025-12-04T15:04:54.8715746Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8715788Z return func(*args, **kwargs) 2025-12-04T15:04:54.8716005Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8716044Z return func(*args, **kwargs) 2025-12-04T15:04:54.8716274Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8716314Z return func(*args, **kwargs) 2025-12-04T15:04:54.8716531Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T15:04:54.8716570Z return func(*args, **kwargs) 2025-12-04T15:04:54.8716714Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8716874Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8717160Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8717313Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8717598Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8717723Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8718000Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8718146Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8718422Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8718568Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8718841Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8719008Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8719281Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8719428Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8719962Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 73216 on device 2. CUDA driver allocated memory was 2300575744 and is now 3418357760. 
2025-12-04T15:04:54.8720079Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8720312Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8720729Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8720843Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8721050Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8721215Z [rank2]:E1204 14:59:44.785000 450077 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8721253Z dist init r=2, world=4 2025-12-04T15:04:54.8721388Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8721546Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8721830Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8721982Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8722267Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8722390Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8722665Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8722810Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8723084Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8723262Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8723533Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8723670Z [rank0]:E1204 14:59:44.830000 450075 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8723959Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8724104Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8724627Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 73216 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 2025-12-04T15:04:54.8724754Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8724948Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8725350Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8725463Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8725674Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8725836Z [rank0]:E1204 14:59:44.830000 450075 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8725874Z dist init r=0, world=4 2025-12-04T15:04:54.8726010Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8726167Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8726451Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8726607Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8726890Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8727014Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8727289Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T15:04:54.8727457Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8727729Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8727873Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8728159Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8728293Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8728571Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8728718Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8729249Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 71168 on device 3. CUDA driver allocated memory was 2250244096 and is now 3368026112. 
2025-12-04T15:04:54.8729361Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8729554Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8729959Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8730069Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8730321Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8730483Z [rank3]:E1204 14:59:44.836000 450078 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8730523Z dist init r=3, world=4 2025-12-04T15:04:54.8730660Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8730819Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8731105Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8731256Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8731539Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8731697Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8731971Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8732116Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8732406Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8732551Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8732824Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8732961Z [rank1]:E1204 14:59:44.846000 450076 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8733248Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8733395Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8733912Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 73216 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 2025-12-04T15:04:54.8734027Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8734220Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8734622Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8734734Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8734943Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8735105Z [rank1]:E1204 14:59:44.846000 450076 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8735143Z dist init r=1, world=4 2025-12-04T15:04:54.8735478Z [rank0]:[W1204 14:59:45.597162724 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8735517Z FAILED [7.5160s] [100%] 2025-12-04T15:04:54.8735519Z 2025-12-04T15:04:54.8735574Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8735728Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda _ 2025-12-04T15:04:54.8735784Z Traceback (most recent call last): 2025-12-04T15:04:54.8735944Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8735988Z self._join_processes(fn) 2025-12-04T15:04:54.8736158Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8736213Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8736390Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8736449Z raise RuntimeError(error) 2025-12-04T15:04:54.8736530Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.8736573Z Traceback (most recent call last): 2025-12-04T15:04:54.8736735Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8736776Z getattr(self, test_name)() 2025-12-04T15:04:54.8736935Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8736968Z fn() 2025-12-04T15:04:54.8737120Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8737169Z method(*args, **kwargs) 2025-12-04T15:04:54.8737318Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8737358Z method(*args, **kwargs) 2025-12-04T15:04:54.8737506Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8737543Z with policy(): 2025-12-04T15:04:54.8737694Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8737735Z raise RuntimeError(msg) 2025-12-04T15:04:54.8738132Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 73216 on device 2. CUDA driver allocated memory was 2300575744 and is now 3418357760. 
2025-12-04T15:04:54.8738134Z 2025-12-04T15:04:54.8738208Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8738487Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8738489Z 2025-12-04T15:04:54.8738578Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8738582Z 2025-12-04T15:04:54.8738583Z 2025-12-04T15:04:54.8738657Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8738745Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8738976Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c42e4a4d65ee4c8f.xml - 2025-12-04T15:04:54.8739037Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8739327Z FAILED [7.5160s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.8739373Z Traceback (most recent call last): 2025-12-04T15:04:54.8739534Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8739603Z getattr(self, test_name)() 2025-12-04T15:04:54.8739762Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8739798Z fn() 2025-12-04T15:04:54.8739948Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8739987Z method(*args, **kwargs) 2025-12-04T15:04:54.8740136Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8740216Z method(*args, **kwargs) 2025-12-04T15:04:54.8740381Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8740418Z with policy(): 2025-12-04T15:04:54.8740567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8740609Z raise RuntimeError(msg) 2025-12-04T15:04:54.8741002Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 73216 on device 2. CUDA driver allocated memory was 2300575744 and is now 3418357760. 2025-12-04T15:04:54.8741006Z 2025-12-04T15:04:54.8741096Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8741372Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8741374Z 2025-12-04T15:04:54.8741460Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8741525Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.8741585Z ======================= 1 failed, 26 deselected in 7.67s ======================= 2025-12-04T15:04:54.8741622Z Got exit code 1 2025-12-04T15:04:54.8741661Z Retrying single test... 2025-12-04T15:04:54.8741847Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-70c0055a1c8498c6.xml 2025-12-04T15:04:54.8741904Z ============================= test session starts ============================== 2025-12-04T15:04:54.8742017Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8742056Z cachedir: .pytest_cache 2025-12-04T15:04:54.8742212Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8742258Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8742298Z configfile: pytest.ini 2025-12-04T15:04:54.8742457Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8742533Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8742802Z stepcurrent: skipping 18 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8742846Z Running 1 items in this shard 2025-12-04T15:04:54.8742850Z 2025-12-04T15:04:54.8743199Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda I1204 14:59:48.933000 450408 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 450477 2025-12-04T15:04:54.8743351Z I1204 14:59:48.933000 450408 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 450478 2025-12-04T15:04:54.8743528Z I1204 14:59:48.934000 450408 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 450479 2025-12-04T15:04:54.8743678Z I1204 14:59:48.934000 450408 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 450480 2025-12-04T15:04:54.8743967Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8744016Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8744601Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8744640Z _warn_cpu_init() 2025-12-04T15:04:54.8744927Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T15:04:54.8745002Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8745296Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8745346Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8745910Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8745949Z _warn_cpu_init() 2025-12-04T15:04:54.8746235Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8746311Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8746593Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8746641Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8747207Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8747242Z _warn_cpu_init() 2025-12-04T15:04:54.8747528Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8747599Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8747903Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8747960Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T15:04:54.8748530Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8748576Z _warn_cpu_init() 2025-12-04T15:04:54.8748861Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8748933Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T15:04:54.8749218Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8749262Z return func(*args, **kwargs) 2025-12-04T15:04:54.8749498Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8749542Z return func(*args, **kwargs) 2025-12-04T15:04:54.8749763Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8749803Z return func(*args, **kwargs) 2025-12-04T15:04:54.8750024Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8750064Z return func(*args, **kwargs) 2025-12-04T15:04:54.8750320Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8750360Z return func(*args, **kwargs) 2025-12-04T15:04:54.8750578Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8750617Z return func(*args, **kwargs) 2025-12-04T15:04:54.8750833Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8750875Z return func(*args, **kwargs) 2025-12-04T15:04:54.8751091Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8751129Z return func(*args, **kwargs) 2025-12-04T15:04:54.8751345Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T15:04:54.8751385Z return func(*args, **kwargs) 2025-12-04T15:04:54.8751528Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8751691Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8751977Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8752159Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8752445Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8752569Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8752857Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8753003Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8753279Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8753424Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8753715Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8753853Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8754127Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8754274Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8754796Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 60928 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 
2025-12-04T15:04:54.8754911Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8755107Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8755518Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8755631Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8755840Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8756003Z [rank0]:E1204 14:59:54.714000 450477 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8756041Z dist init r=0, world=4 2025-12-04T15:04:54.8756191Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8756364Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8756647Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8756801Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8757096Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8757220Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8757497Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8757643Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8757932Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8758080Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8758352Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8758490Z [rank1]:E1204 14:59:54.715000 450478 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8758765Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8758913Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8759434Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 58880 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 2025-12-04T15:04:54.8759550Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8759744Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8760148Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8760304Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8760516Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8760709Z [rank1]:E1204 14:59:54.715000 450478 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8760748Z dist init r=1, world=4 2025-12-04T15:04:54.8760884Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8761044Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8761343Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8761497Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8761783Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8761907Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8762201Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T15:04:54.8762347Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8762619Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8762765Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8763037Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8763173Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8763448Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8763593Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8764114Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 69120 on device 3. CUDA driver allocated memory was 2250244096 and is now 3368026112. 
2025-12-04T15:04:54.8764227Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8764420Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8764822Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8764954Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8765165Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8765330Z [rank3]:E1204 14:59:54.723000 450480 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8765369Z dist init r=3, world=4 2025-12-04T15:04:54.8765519Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8765678Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8765965Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8766116Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8766412Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8766535Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8766815Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8766963Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8767237Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8767382Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8767656Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8767789Z [rank2]:E1204 14:59:54.726000 450479 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8768064Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8768215Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8768738Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 67072 on device 2. CUDA driver allocated memory was 2300575744 and is now 3418357760. 2025-12-04T15:04:54.8768853Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8772502Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8772929Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8773041Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8773263Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8773427Z [rank2]:E1204 14:59:54.726000 450479 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8773465Z dist init r=2, world=4 2025-12-04T15:04:54.8773798Z [rank0]:[W1204 14:59:54.436033922 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8773836Z FAILED [7.4158s] [100%] 2025-12-04T15:04:54.8773838Z 2025-12-04T15:04:54.8773915Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8774058Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda _ 2025-12-04T15:04:54.8774107Z Traceback (most recent call last): 2025-12-04T15:04:54.8774269Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8774313Z self._join_processes(fn) 2025-12-04T15:04:54.8774486Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8774539Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8774714Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8774756Z raise RuntimeError(error) 2025-12-04T15:04:54.8774836Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.8774880Z Traceback (most recent call last): 2025-12-04T15:04:54.8775040Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8775083Z getattr(self, test_name)() 2025-12-04T15:04:54.8775241Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8775277Z fn() 2025-12-04T15:04:54.8775428Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8775470Z method(*args, **kwargs) 2025-12-04T15:04:54.8775621Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8775660Z method(*args, **kwargs) 2025-12-04T15:04:54.8775810Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8775845Z with policy(): 2025-12-04T15:04:54.8775998Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8776040Z raise RuntimeError(msg) 2025-12-04T15:04:54.8776438Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 58880 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 
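For reference, the repro line the harness prints can be replayed programmatically from the base repo dir; this sketch wraps the exact command and environment variables quoted in the log in subprocess (check=False is an assumption, adjust as needed).

    import os
    import subprocess

    # Command and env-var names are taken verbatim from the log output above.
    env = dict(os.environ,
               PYTORCH_TEST_WITH_ROCM="1",
               PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")
    subprocess.run(
        ["python", "test/distributed/fsdp/test_fsdp_core.py",
         "TestParityWithDDPCUDA."
         "test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda"],
        env=env,
        check=False,
    )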
2025-12-04T15:04:54.8776464Z 2025-12-04T15:04:54.8776539Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8776815Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8776818Z 2025-12-04T15:04:54.8776907Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8776910Z 2025-12-04T15:04:54.8776913Z 2025-12-04T15:04:54.8777001Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8777088Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8777315Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-70c0055a1c8498c6.xml - 2025-12-04T15:04:54.8777379Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8777671Z FAILED [7.4158s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.8777716Z Traceback (most recent call last): 2025-12-04T15:04:54.8777890Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8777934Z getattr(self, test_name)() 2025-12-04T15:04:54.8778091Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8778128Z fn() 2025-12-04T15:04:54.8778276Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8778317Z method(*args, **kwargs) 2025-12-04T15:04:54.8778466Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8778506Z method(*args, **kwargs) 2025-12-04T15:04:54.8778654Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8778693Z with policy(): 2025-12-04T15:04:54.8778843Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8778884Z raise RuntimeError(msg) 2025-12-04T15:04:54.8779281Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 58880 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 2025-12-04T15:04:54.8779288Z 2025-12-04T15:04:54.8782658Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8782944Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8782947Z 2025-12-04T15:04:54.8783039Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8783102Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.8783168Z ======================= 1 failed, 26 deselected in 7.57s ======================= 2025-12-04T15:04:54.8783205Z Got exit code 1 2025-12-04T15:04:54.8783433Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda 2025-12-04T15:04:54.8783614Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.8783801Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-970f063f51fcb70d.xml 2025-12-04T15:04:54.8783858Z ============================= test session starts ============================== 2025-12-04T15:04:54.8783976Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8784016Z cachedir: .pytest_cache 2025-12-04T15:04:54.8784175Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8784247Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8784288Z configfile: pytest.ini 2025-12-04T15:04:54.8784450Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8784530Z collecting ... collected 60 items / 19 deselected / 41 selected 2025-12-04T15:04:54.8784583Z stepcurrent: skipping 19 already run items. 2025-12-04T15:04:54.8784625Z Running 8 items in this shard 2025-12-04T15:04:54.8784627Z 2025-12-04T15:04:54.8784947Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_no_shard_cuda I1204 14:59:58.698000 450810 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 450879 2025-12-04T15:04:54.8785099Z I1204 14:59:58.698000 450810 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 450880 2025-12-04T15:04:54.8785251Z I1204 14:59:58.699000 450810 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 450881 2025-12-04T15:04:54.8785399Z I1204 14:59:58.699000 450810 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 450882 2025-12-04T15:04:54.8785760Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8785812Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8786164Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8786210Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8786559Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8786607Z self.encoder = TransformerEncoder( 
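The enable_nested_tensor UserWarning repeated above names its own remedy: construct the encoder layer with batch_first=True so the nested-tensor fast path can engage. A minimal sketch, with d_model/nhead as arbitrary placeholder values:

    import torch.nn as nn

    # batch_first=True on the layer avoids the warning seen above and enables
    # the nested-tensor inference fast path.
    layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
    encoder = nn.TransformerEncoder(layer, num_layers=2, enable_nested_tensor=True)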
2025-12-04T15:04:54.8786953Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8786997Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8787277Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8787321Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.8787893Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8787954Z _warn_cpu_init() 2025-12-04T15:04:54.8788230Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8788273Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.8788852Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8788890Z _warn_cpu_init() 2025-12-04T15:04:54.8789178Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8789216Z fsdp_model = FSDP( 2025-12-04T15:04:54.8789511Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8789548Z fsdp_model = FSDP( 2025-12-04T15:04:54.8789822Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8789865Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.8790473Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8790511Z _warn_cpu_init() 2025-12-04T15:04:54.8790784Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8790826Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.8791390Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8791428Z _warn_cpu_init() 2025-12-04T15:04:54.8791715Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8791754Z fsdp_model = FSDP( 2025-12-04T15:04:54.8792034Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8792094Z fsdp_model = FSDP( 2025-12-04T15:04:54.8792320Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8792378Z return func(*args, **kwargs) 2025-12-04T15:04:54.8792600Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8792640Z return func(*args, **kwargs) 2025-12-04T15:04:54.8792860Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8792917Z return func(*args, **kwargs) 2025-12-04T15:04:54.8793137Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8793177Z return func(*args, **kwargs) 2025-12-04T15:04:54.8793396Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8793435Z return func(*args, **kwargs) 2025-12-04T15:04:54.8793657Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8793713Z return func(*args, **kwargs) 2025-12-04T15:04:54.8793936Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T15:04:54.8793975Z return func(*args, **kwargs) 2025-12-04T15:04:54.8794196Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8794236Z return func(*args, **kwargs) 2025-12-04T15:04:54.8794526Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8794566Z return func(*args, **kwargs) 2025-12-04T15:04:54.8795847Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8795981Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8797262Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8797414Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8798679Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8798800Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8800058Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8800213Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8800362Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8800525Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8800817Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8800974Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8801264Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8801406Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8801705Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8801854Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8802128Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8802292Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8802565Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8802702Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8802998Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8803144Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8803618Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1708544 on device 0. CUDA driver allocated memory was 2453667840 and is now 4081057792. 2025-12-04T15:04:54.8803733Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8803930Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8804287Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8804400Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8804610Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8804773Z [rank0]:E1204 15:00:07.718000 450879 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8804812Z dist init r=0, world=4 2025-12-04T15:04:54.8804947Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8805107Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8805392Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8805544Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8805839Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8805976Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8806252Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8806413Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8806688Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8806833Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8807106Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8807253Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8807530Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8807675Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8808148Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1708544 on device 2. CUDA driver allocated memory was 2300575744 and is now 3927965696. 
2025-12-04T15:04:54.8808264Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8808459Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8808812Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8808925Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8809134Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8809297Z [rank2]:E1204 15:00:07.719000 450881 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8809336Z dist init r=2, world=4 2025-12-04T15:04:54.8809471Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8809630Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8809914Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8810091Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8810630Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8810751Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8811042Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8811193Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8811471Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8811620Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8811906Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8812043Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.8812320Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8812469Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8812942Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1708544 on device 3. CUDA driver allocated memory was 2250244096 and is now 3877634048. 2025-12-04T15:04:54.8813058Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8813251Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8813603Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8813714Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8813922Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8814088Z [rank3]:E1204 15:00:07.725000 450882 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8814125Z dist init r=3, world=4 2025-12-04T15:04:54.8814261Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8814431Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8814728Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8814880Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8815172Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8815296Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8815572Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8815717Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
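If the stream mismatch flagged by the AccumulateGrad warnings above is in fact intentional, the warning text names its own switch; this one-liner simply invokes it:

    import torch

    # Named in the warning itself; only suppress if the mismatch is intentional.
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)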
2025-12-04T15:04:54.8816000Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8816146Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8816418Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8816553Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8816826Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8816973Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8817442Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1708544 on device 1. CUDA driver allocated memory was 2317352960 and is now 3944742912. 2025-12-04T15:04:54.8817555Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8817750Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8818097Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8818209Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8818417Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8818595Z [rank1]:E1204 15:00:07.780000 450880 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8818651Z dist init r=1, world=4 2025-12-04T15:04:54.8818985Z [rank0]:[W1204 15:00:07.418452149 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8819026Z FAILED [10.9214s] [ 12%] 2025-12-04T15:04:54.8819029Z 2025-12-04T15:04:54.8819086Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8819184Z ______ TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda ______ 2025-12-04T15:04:54.8819240Z Traceback (most recent call last): 2025-12-04T15:04:54.8819404Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8819448Z self._join_processes(fn) 2025-12-04T15:04:54.8819620Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8819673Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8819853Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8819895Z raise RuntimeError(error) 2025-12-04T15:04:54.8819984Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.8820028Z Traceback (most recent call last): 2025-12-04T15:04:54.8820229Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8820271Z getattr(self, test_name)() 2025-12-04T15:04:54.8820429Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8820468Z fn() 2025-12-04T15:04:54.8820618Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8820658Z method(*args, **kwargs) 2025-12-04T15:04:54.8820809Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8820847Z method(*args, **kwargs) 2025-12-04T15:04:54.8820997Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8821032Z with policy(): 2025-12-04T15:04:54.8821184Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8821224Z raise RuntimeError(msg) 2025-12-04T15:04:54.8821571Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1708544 on device 3. CUDA driver allocated memory was 2250244096 and is now 3877634048. 
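The ProcessGroupNCCL shutdown warning above asks for an explicit destroy_process_group() before the program exits. A minimal lifecycle sketch, assuming a single-rank group with placeholder rendezvous values (MASTER_ADDR/MASTER_PORT are illustrative, not taken from this job):

    import os
    import torch.distributed as dist

    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=0, world_size=1)
    try:
        pass  # collective work goes here
    finally:
        dist.destroy_process_group()  # avoids the shutdown warning seen above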
2025-12-04T15:04:54.8821576Z 2025-12-04T15:04:54.8821651Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8821873Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8821875Z 2025-12-04T15:04:54.8821964Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8821967Z 2025-12-04T15:04:54.8821969Z 2025-12-04T15:04:54.8822045Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8822133Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8822366Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-970f063f51fcb70d.xml - 2025-12-04T15:04:54.8822458Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8822699Z FAILED [10.9214s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.8822744Z Traceback (most recent call last): 2025-12-04T15:04:54.8822907Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8822951Z getattr(self, test_name)() 2025-12-04T15:04:54.8823121Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8823158Z fn() 2025-12-04T15:04:54.8823307Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8823348Z method(*args, **kwargs) 2025-12-04T15:04:54.8823497Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8823537Z method(*args, **kwargs) 2025-12-04T15:04:54.8823688Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8823725Z with policy(): 2025-12-04T15:04:54.8823891Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8823932Z raise RuntimeError(msg) 2025-12-04T15:04:54.8824280Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1708544 on device 3. CUDA driver allocated memory was 2250244096 and is now 3877634048. 2025-12-04T15:04:54.8824283Z 2025-12-04T15:04:54.8824357Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8824580Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8824582Z 2025-12-04T15:04:54.8824669Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8824734Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
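The FutureWarnings repeated through both test sessions deprecate FSDP's NO_SHARD sharding strategy in favour of DistributedDataParallel; below is a sketch of the suggested replacement, again with a toy module and an already-initialized process group as assumptions:

    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    # Assumes the process group is initialized; the model is a placeholder.
    rank = dist.get_rank()
    device = torch.device("cuda", rank % torch.cuda.device_count())
    model = torch.nn.Linear(16, 16).to(device)
    ddp_model = DDP(model, device_ids=[device.index])  # replaces FSDP(NO_SHARD)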
2025-12-04T15:04:54.8824799Z ====================== 1 failed, 19 deselected in 11.08s ======================= 2025-12-04T15:04:54.8824837Z Got exit code 1 2025-12-04T15:04:54.8824876Z Retrying single test... 2025-12-04T15:04:54.8825068Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e4bf6224d8b5a0bd.xml 2025-12-04T15:04:54.8825126Z ============================= test session starts ============================== 2025-12-04T15:04:54.8825246Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8825287Z cachedir: .pytest_cache 2025-12-04T15:04:54.8825447Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8825491Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8825534Z configfile: pytest.ini 2025-12-04T15:04:54.8825699Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8825778Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8825996Z stepcurrent: skipping 19 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8826054Z Running 1 items in this shard 2025-12-04T15:04:54.8826056Z 2025-12-04T15:04:54.8826358Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_no_shard_cuda I1204 15:00:12.145000 451212 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 451281 2025-12-04T15:04:54.8826528Z I1204 15:00:12.145000 451212 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 451282 2025-12-04T15:04:54.8826683Z I1204 15:00:12.146000 451212 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 451283 2025-12-04T15:04:54.8826832Z I1204 15:00:12.146000 451212 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 451284 2025-12-04T15:04:54.8827207Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8827257Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8827616Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8827664Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8828029Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8828080Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8828429Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because 
encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8828480Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8828761Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8828809Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.8829388Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8829431Z _warn_cpu_init() 2025-12-04T15:04:54.8829711Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8829756Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.8830030Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8830077Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.8830395Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8830440Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.8831019Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8831095Z _warn_cpu_init() 2025-12-04T15:04:54.8831679Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8831720Z _warn_cpu_init() 2025-12-04T15:04:54.8832286Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8832348Z _warn_cpu_init() 2025-12-04T15:04:54.8832641Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8832685Z fsdp_model = FSDP( 2025-12-04T15:04:54.8832969Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8833012Z fsdp_model = FSDP( 2025-12-04T15:04:54.8833297Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8833338Z fsdp_model = FSDP( 2025-12-04T15:04:54.8833622Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8833665Z fsdp_model = FSDP( 2025-12-04T15:04:54.8833893Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8833941Z return func(*args, **kwargs) 2025-12-04T15:04:54.8834168Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8834213Z return func(*args, **kwargs) 2025-12-04T15:04:54.8834437Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8834480Z return func(*args, **kwargs) 2025-12-04T15:04:54.8834709Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8834752Z return func(*args, **kwargs) 2025-12-04T15:04:54.8834977Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8835033Z return func(*args, **kwargs) 2025-12-04T15:04:54.8835275Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8835318Z return func(*args, **kwargs) 2025-12-04T15:04:54.8835544Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8835588Z return func(*args, **kwargs) 2025-12-04T15:04:54.8835826Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T15:04:54.8835868Z return func(*args, **kwargs) 2025-12-04T15:04:54.8836163Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8836206Z return func(*args, **kwargs) 2025-12-04T15:04:54.8837490Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8837623Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8838883Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8839009Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8840323Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. 
If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8840478Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8841762Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8841890Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8842043Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8842207Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8842504Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8842667Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8842955Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8843088Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8843373Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8843531Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8843808Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8843962Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8844238Z [rank2]:E1204 15:00:21.250000 451283 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8844383Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8844680Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8844843Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8845337Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1776128 on device 2. CUDA driver allocated memory was 2300575744 and is now 3927965696. 2025-12-04T15:04:54.8845456Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8845660Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8846016Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8846149Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8846369Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8846535Z [rank2]:E1204 15:00:21.250000 451283 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8846582Z dist init r=2, world=4 2025-12-04T15:04:54.8846724Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8846890Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8847185Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8847344Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8847653Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8847779Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8848065Z [rank0]:E1204 15:00:21.253000 451281 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8848216Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8848500Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8848656Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8848933Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8849099Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8849381Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8849537Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8850033Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1776128 on device 0. CUDA driver allocated memory was 2453667840 and is now 4081057792. 
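The repeated `_warn_cpu_init()` UserWarning above recommends passing `device_id` so FSDP runs its sharding initialization on the GPU rather than on CPU. A minimal sketch of that pattern, assuming the process group is already initialized and one GPU is visible per rank as in this job; the `nn.Linear` is a placeholder stand-in, not the test's actual transformer model:

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Placeholder module; the test wraps a transformer instead.
    model = nn.Linear(16, 16)

    fsdp_model = FSDP(
        model,
        # Moves the module to this rank's GPU for sharding init,
        # as the UserWarning above recommends.
        device_id=torch.cuda.current_device(),
        # Per the same warning, this flag requires the module on GPU.
        sync_module_states=True,
    )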
2025-12-04T15:04:54.8850156Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8850403Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8850777Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8850899Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8851112Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8851284Z [rank0]:E1204 15:00:21.253000 451281 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8851326Z dist init r=0, world=4 2025-12-04T15:04:54.8851467Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8851627Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8851919Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8852074Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8852368Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8852499Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8852776Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8852930Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8853206Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8853398Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8853676Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8853816Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.8854114Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8854264Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8854740Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1776128 on device 1. CUDA driver allocated memory was 2317352960 and is now 3944742912. 2025-12-04T15:04:54.8854899Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8855100Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8855454Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8855581Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8855802Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8855971Z [rank1]:E1204 15:00:21.331000 451282 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8856018Z dist init r=1, world=4 2025-12-04T15:04:54.8856159Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8856323Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8856611Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8856776Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8857061Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8857188Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8857470Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8857637Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T15:04:54.8857938Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8858087Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8858369Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8858519Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8858803Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8858953Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8859449Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1742336 on device 3. CUDA driver allocated memory was 2250244096 and is now 3877634048. 2025-12-04T15:04:54.8859572Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8859766Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8860122Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8860279Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8860495Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8860662Z [rank3]:E1204 15:00:21.334000 451284 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8860712Z dist init r=3, world=4 2025-12-04T15:04:54.8861057Z [rank0]:[W1204 15:00:21.967144725 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8861110Z FAILED [11.0215s] [100%] 2025-12-04T15:04:54.8861112Z 2025-12-04T15:04:54.8861175Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8861276Z ______ TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda ______ 2025-12-04T15:04:54.8861331Z Traceback (most recent call last): 2025-12-04T15:04:54.8861499Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8861554Z self._join_processes(fn) 2025-12-04T15:04:54.8861730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8861792Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8861987Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8862061Z raise RuntimeError(error) 2025-12-04T15:04:54.8862146Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8862198Z Traceback (most recent call last): 2025-12-04T15:04:54.8862359Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8862408Z getattr(self, test_name)() 2025-12-04T15:04:54.8862571Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8862631Z fn() 2025-12-04T15:04:54.8862785Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8862834Z method(*args, **kwargs) 2025-12-04T15:04:54.8862986Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8863037Z method(*args, **kwargs) 2025-12-04T15:04:54.8863188Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8863234Z with policy(): 2025-12-04T15:04:54.8863401Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8863451Z raise RuntimeError(msg) 2025-12-04T15:04:54.8863798Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1776128 on device 0. CUDA driver allocated memory was 2453667840 and is now 4081057792. 
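Two of the warnings in this run concern process-group hygiene: `barrier()` falls back to the device under the current context unless `init_process_group` is given a `device_id`, and `destroy_process_group()` was never called before exit. A minimal sketch of both fixes; `rank` and `world_size` are placeholders, the backend choice assumes NCCL (mapped to RCCL on ROCm builds like this one), and the usual rendezvous environment variables are assumed to be set:

    import torch
    import torch.distributed as dist

    def run(rank: int, world_size: int) -> None:
        # Binding the group to a specific device silences the
        # "barrier(): using the device under current context" warning.
        dist.init_process_group(
            backend="nccl",
            rank=rank,
            world_size=world_size,
            device_id=torch.device("cuda", rank),
        )
        try:
            dist.barrier()
            # ... test or training body ...
        finally:
            # Avoids the ProcessGroupNCCL shutdown warning seen above.
            dist.destroy_process_group()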
2025-12-04T15:04:54.8863801Z 2025-12-04T15:04:54.8863885Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8864111Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8864118Z 2025-12-04T15:04:54.8864207Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8864210Z 2025-12-04T15:04:54.8864211Z 2025-12-04T15:04:54.8864294Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8864383Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8864626Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e4bf6224d8b5a0bd.xml - 2025-12-04T15:04:54.8864690Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8864942Z FAILED [11.0215s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8864991Z Traceback (most recent call last): 2025-12-04T15:04:54.8865165Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8865211Z getattr(self, test_name)() 2025-12-04T15:04:54.8865380Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8865416Z fn() 2025-12-04T15:04:54.8865574Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8865617Z method(*args, **kwargs) 2025-12-04T15:04:54.8865775Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8865830Z method(*args, **kwargs) 2025-12-04T15:04:54.8865986Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8866050Z with policy(): 2025-12-04T15:04:54.8866207Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8866249Z raise RuntimeError(msg) 2025-12-04T15:04:54.8866608Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1776128 on device 0. CUDA driver allocated memory was 2453667840 and is now 4081057792. 2025-12-04T15:04:54.8866610Z 2025-12-04T15:04:54.8866702Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8866927Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8866931Z 2025-12-04T15:04:54.8867031Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8867097Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T15:04:54.8867171Z ====================== 1 failed, 26 deselected in 11.18s ======================= 2025-12-04T15:04:54.8867210Z Got exit code 1 2025-12-04T15:04:54.8867259Z Retrying single test... 2025-12-04T15:04:54.8867466Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-32c4c29f642bbd03.xml 2025-12-04T15:04:54.8867535Z ============================= test session starts ============================== 2025-12-04T15:04:54.8867651Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8867699Z cachedir: .pytest_cache 2025-12-04T15:04:54.8867859Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8867912Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8867954Z configfile: pytest.ini 2025-12-04T15:04:54.8868126Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8868203Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8868431Z stepcurrent: skipping 19 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8868476Z Running 1 items in this shard 2025-12-04T15:04:54.8868478Z 2025-12-04T15:04:54.8868786Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_no_shard_cuda I1204 15:00:26.147000 451614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 451683 2025-12-04T15:04:54.8868942Z I1204 15:00:26.148000 451614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 451684 2025-12-04T15:04:54.8869101Z I1204 15:00:26.148000 451614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 451685 2025-12-04T15:04:54.8869256Z I1204 15:00:26.149000 451614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 451686 2025-12-04T15:04:54.8869620Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8869677Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8870029Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8870115Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8870515Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8870567Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8870934Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because 
encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8870988Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8871274Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8871322Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.8871914Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8871954Z _warn_cpu_init() 2025-12-04T15:04:54.8872245Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8872288Z fsdp_model = FSDP( 2025-12-04T15:04:54.8872569Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8872613Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.8873184Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8873226Z _warn_cpu_init() 2025-12-04T15:04:54.8873505Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8873553Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.8873826Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8873873Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.8874440Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.8874498Z _warn_cpu_init() 2025-12-04T15:04:54.8875080Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8875123Z _warn_cpu_init() 2025-12-04T15:04:54.8875428Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8875469Z fsdp_model = FSDP( 2025-12-04T15:04:54.8875757Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8875797Z fsdp_model = FSDP( 2025-12-04T15:04:54.8876089Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.8876127Z fsdp_model = FSDP( 2025-12-04T15:04:54.8876374Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8876418Z return func(*args, **kwargs) 2025-12-04T15:04:54.8876648Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8876690Z return func(*args, **kwargs) 2025-12-04T15:04:54.8876919Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8876964Z return func(*args, **kwargs) 2025-12-04T15:04:54.8877186Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.8877230Z return func(*args, **kwargs) 2025-12-04T15:04:54.8877449Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T15:04:54.8877496Z return func(*args, **kwargs) 2025-12-04T15:04:54.8877713Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T15:04:54.8877759Z return func(*args, **kwargs) 2025-12-04T15:04:54.8877981Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T15:04:54.8878026Z return func(*args, **kwargs) 2025-12-04T15:04:54.8878249Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T15:04:54.8878294Z return func(*args, **kwargs) 2025-12-04T15:04:54.8878589Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.8878634Z return func(*args, **kwargs) 2025-12-04T15:04:54.8879924Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8880072Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8881381Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8881511Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8882766Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8882890Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8884150Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T15:04:54.8884311Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T15:04:54.8884457Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8884625Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8884928Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8885088Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8885377Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8885506Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8885799Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8885948Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8886227Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8886374Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8886655Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8886792Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8887072Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8887223Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8887697Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1776128 on device 1. CUDA driver allocated memory was 2317352960 and is now 3944742912. 2025-12-04T15:04:54.8887817Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8888014Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8888368Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda 2025-12-04T15:04:54.8888510Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8888724Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8888891Z [rank1]:E1204 15:00:34.945000 451684 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8888936Z dist init r=1, world=4 2025-12-04T15:04:54.8889089Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8889248Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8889539Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8889695Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8889996Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8890121Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8890435Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8890585Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8890864Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8891015Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8891291Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8891432Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8891712Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8891863Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8892337Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1708544 on device 0. CUDA driver allocated memory was 2453667840 and is now 4081057792. 
2025-12-04T15:04:54.8892456Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8892668Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8893032Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda
2025-12-04T15:04:54.8893152Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8893377Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8893547Z [rank0]:E1204 15:00:35.000000 451683 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T15:04:54.8893588Z dist init r=0, world=4
2025-12-04T15:04:54.8893730Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8893891Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8894194Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8894347Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8894636Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8894763Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8895038Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8895187Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8895462Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8895611Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8895887Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8896027Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8896307Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8896455Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8896929Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1708544 on device 2. CUDA driver allocated memory was 2300575744 and is now 3927965696.
2025-12-04T15:04:54.8897069Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8897265Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8897613Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda
2025-12-04T15:04:54.8897742Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8897953Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8898121Z [rank2]:E1204 15:00:35.002000 451685 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T15:04:54.8898163Z dist init r=2, world=4
2025-12-04T15:04:54.8898299Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8898476Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8898762Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8898918Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8899204Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8899331Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8899608Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8899760Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8900040Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8900233Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8900513Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8900649Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8900929Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8901077Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8901577Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1708544 on device 3. CUDA driver allocated memory was 2250244096 and is now 3877634048.
2025-12-04T15:04:54.8901696Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8901889Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8902254Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda
2025-12-04T15:04:54.8902370Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8902584Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8902762Z [rank3]:E1204 15:00:35.003000 451686 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T15:04:54.8902807Z dist init r=3, world=4
2025-12-04T15:04:54.8903146Z [rank0]:[W1204 15:00:35.794855795 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T15:04:54.8903191Z FAILED [10.7215s] [100%]
2025-12-04T15:04:54.8903194Z
2025-12-04T15:04:54.8903250Z =================================== FAILURES ===================================
2025-12-04T15:04:54.8903354Z ______ TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda ______
2025-12-04T15:04:54.8903400Z Traceback (most recent call last):
2025-12-04T15:04:54.8903568Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.8903616Z self._join_processes(fn)
2025-12-04T15:04:54.8903789Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.8903845Z self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.8904024Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.8904071Z raise RuntimeError(error)
2025-12-04T15:04:54.8904151Z RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T15:04:54.8904200Z Traceback (most recent call last):
2025-12-04T15:04:54.8904360Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8904406Z getattr(self, test_name)()
2025-12-04T15:04:54.8904564Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8904603Z fn()
2025-12-04T15:04:54.8904754Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8904799Z method(*args, **kwargs)
2025-12-04T15:04:54.8904950Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8904996Z method(*args, **kwargs)
2025-12-04T15:04:54.8905146Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8905213Z with policy():
2025-12-04T15:04:54.8905365Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8905410Z raise RuntimeError(msg)
2025-12-04T15:04:54.8905758Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1776128 on device 1. CUDA driver allocated memory was 2317352960 and is now 3944742912.
2025-12-04T15:04:54.8905760Z
2025-12-04T15:04:54.8905839Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8906073Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda
2025-12-04T15:04:54.8906076Z
2025-12-04T15:04:54.8906169Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8906174Z
2025-12-04T15:04:54.8906176Z
2025-12-04T15:04:54.8906255Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T15:04:54.8906343Z Process 1 terminated with exit code 10, terminating remaining processes.
2025-12-04T15:04:54.8906580Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-32c4c29f642bbd03.xml -
2025-12-04T15:04:54.8906652Z =========================== short test summary info ============================
2025-12-04T15:04:54.8906902Z FAILED [10.7215s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T15:04:54.8906949Z Traceback (most recent call last):
2025-12-04T15:04:54.8907116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8907160Z getattr(self, test_name)()
2025-12-04T15:04:54.8907322Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8907358Z fn()
2025-12-04T15:04:54.8907511Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8907552Z method(*args, **kwargs)
2025-12-04T15:04:54.8907706Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8907746Z method(*args, **kwargs)
2025-12-04T15:04:54.8907899Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8907937Z with policy():
2025-12-04T15:04:54.8908091Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8908133Z raise RuntimeError(msg)
2025-12-04T15:04:54.8908481Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1776128 on device 1. CUDA driver allocated memory was 2317352960 and is now 3944742912.
2025-12-04T15:04:54.8908483Z
2025-12-04T15:04:54.8908563Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8908787Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_no_shard_cuda
2025-12-04T15:04:54.8908790Z
2025-12-04T15:04:54.8908881Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8908955Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
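The failure above comes from PyTorch's per-test CUDA memory-leak checker, enabled here via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: it snapshots per-device memory before the test and fails the test when both the caching allocator and the CUDA/HIP driver report more memory afterwards (hence "CUDA driver API confirmed a leak"). Below is a minimal sketch of that before/after comparison; `cuda_leak_check` is an illustrative helper, not the actual checker in torch.testing._internal.common_utils, which additionally runs garbage collection and retries before declaring a leak.

```python
import contextlib
import torch

@contextlib.contextmanager
def cuda_leak_check(device: int = 0):
    """Illustrative sketch: fail if both allocator- and driver-level
    memory on `device` grew across the wrapped block."""
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes
    free_before, total = torch.cuda.mem_get_info(device)
    driver_before = total - free_before                  # driver-level bytes in use
    try:
        yield
    finally:
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()  # release cached blocks so driver numbers are meaningful
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after
        # Only driver-confirmed growth counts as a leak, mirroring the log message.
        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after} bytes, "
                f"driver {driver_before} -> {driver_after} bytes"
            )

if torch.cuda.is_available():
    with cuda_leak_check(0):
        x = torch.ones(1024, device="cuda:0")  # freed on scope exit; should not trip the check
        del x
```

On a ROCm runner like this one the same torch.cuda APIs are backed by HIP, which is why the printed repro line only needs PYTORCH_TEST_WITH_ROCM=1 and PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 set in the environment of the quoted python command.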
2025-12-04T15:04:54.8909034Z ====================== 1 failed, 26 deselected in 10.88s =======================
2025-12-04T15:04:54.8909084Z Got exit code 1
2025-12-04T15:04:54.8909259Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_no_shard_cuda
2025-12-04T15:04:54.8909387Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T15:04:54.8909578Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-76209ceae9052b7b.xml
2025-12-04T15:04:54.8909637Z ============================= test session starts ==============================
2025-12-04T15:04:54.8909767Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.8909810Z cachedir: .pytest_cache
2025-12-04T15:04:54.8909970Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.8910018Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.8910062Z configfile: pytest.ini
2025-12-04T15:04:54.8910261Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.8910341Z collecting ... collected 60 items / 20 deselected / 40 selected
2025-12-04T15:04:54.8910394Z stepcurrent: skipping 20 already run items.
2025-12-04T15:04:54.8910460Z Running 7 items in this shard
2025-12-04T15:04:54.8910462Z
2025-12-04T15:04:54.8910771Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda I1204 15:00:39.410000 452016 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 452085
2025-12-04T15:04:54.8910928Z I1204 15:00:39.411000 452016 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 452086
2025-12-04T15:04:54.8911082Z I1204 15:00:39.411000 452016 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 452087
2025-12-04T15:04:54.8911236Z I1204 15:00:39.412000 452016 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 452088
2025-12-04T15:04:54.8911599Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.8911649Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.8912004Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.8912052Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.8912402Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.8912448Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.8912799Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.8912844Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.8913418Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.8913487Z _warn_cpu_init()
2025-12-04T15:04:54.8914074Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.8914118Z _warn_cpu_init()
2025-12-04T15:04:54.8914691Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.8914824Z _warn_cpu_init()
2025-12-04T15:04:54.8915466Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.8915526Z _warn_cpu_init()
2025-12-04T15:04:54.8915821Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T15:04:54.8915918Z return func(*args, **kwargs)
2025-12-04T15:04:54.8916127Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8916333Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8916626Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8916786Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8917092Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8917233Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8917512Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8917662Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8917941Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8918114Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8918414Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8918550Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8918845Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8919094Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8919614Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872.
2025-12-04T15:04:54.8919735Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8919944Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8920352Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda
2025-12-04T15:04:54.8920469Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8920690Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8920858Z [rank2]:E1204 15:00:48.598000 452087 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T15:04:54.8920902Z dist init r=2, world=4
2025-12-04T15:04:54.8921045Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8921207Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8921503Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8921665Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8921955Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8922079Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8922364Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8922512Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8922845Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8922992Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8923271Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8923429Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8923706Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8923857Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8924360Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224.
2025-12-04T15:04:54.8924478Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8924674Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8925037Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda
2025-12-04T15:04:54.8925164Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8925384Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8925553Z [rank3]:E1204 15:00:48.606000 452088 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T15:04:54.8925592Z dist init r=3, world=4
2025-12-04T15:04:54.8925733Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8925892Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8926183Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8926338Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8926626Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8926750Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8927043Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8927204Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8927480Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8927628Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8927913Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8928053Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8928330Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8928491Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8928968Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088.
2025-12-04T15:04:54.8929082Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8929281Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8929637Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda
2025-12-04T15:04:54.8929752Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8929964Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8930130Z [rank1]:E1204 15:00:48.652000 452086 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T15:04:54.8930206Z dist init r=1, world=4
2025-12-04T15:04:54.8930349Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8930510Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8930798Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8930955Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8931240Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8931405Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8931681Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8931831Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8932119Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8932271Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8932552Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8932687Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8932992Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8933144Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8933620Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968.
2025-12-04T15:04:54.8933735Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8933933Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8934292Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda
2025-12-04T15:04:54.8934404Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8934619Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8934783Z [rank0]:E1204 15:00:48.666000 452085 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T15:04:54.8934825Z dist init r=0, world=4
2025-12-04T15:04:54.8935164Z [rank0]:[W1204 15:00:48.462471053 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T15:04:54.8935209Z FAILED [11.1229s] [ 14%]
2025-12-04T15:04:54.8935211Z
2025-12-04T15:04:54.8935268Z =================================== FAILURES ===================================
2025-12-04T15:04:54.8935385Z ___ TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda ____
2025-12-04T15:04:54.8935444Z Traceback (most recent call last):
2025-12-04T15:04:54.8935610Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.8935657Z self._join_processes(fn)
2025-12-04T15:04:54.8935833Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.8935888Z self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.8936068Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.8936124Z raise RuntimeError(error)
2025-12-04T15:04:54.8936209Z RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T15:04:54.8936254Z Traceback (most recent call last):
2025-12-04T15:04:54.8936419Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8936466Z getattr(self, test_name)()
2025-12-04T15:04:54.8936624Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8936663Z fn()
2025-12-04T15:04:54.8936813Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8936870Z method(*args, **kwargs)
2025-12-04T15:04:54.8937022Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8937067Z method(*args, **kwargs)
2025-12-04T15:04:54.8937216Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8937260Z with policy():
2025-12-04T15:04:54.8937412Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8937458Z raise RuntimeError(msg)
2025-12-04T15:04:54.8937811Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872.
2025-12-04T15:04:54.8937813Z
2025-12-04T15:04:54.8937893Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8938124Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda
2025-12-04T15:04:54.8938126Z
2025-12-04T15:04:54.8938216Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8938218Z
2025-12-04T15:04:54.8938221Z
2025-12-04T15:04:54.8938296Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T15:04:54.8938388Z Process 2 terminated with exit code 10, terminating remaining processes.
2025-12-04T15:04:54.8938622Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-76209ceae9052b7b.xml -
2025-12-04T15:04:54.8938684Z =========================== short test summary info ============================
2025-12-04T15:04:54.8938937Z FAILED [11.1229s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T15:04:54.8938985Z Traceback (most recent call last):
2025-12-04T15:04:54.8939152Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8939195Z getattr(self, test_name)()
2025-12-04T15:04:54.8939371Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8939419Z fn()
2025-12-04T15:04:54.8939571Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8939612Z method(*args, **kwargs)
2025-12-04T15:04:54.8939770Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8939811Z method(*args, **kwargs)
2025-12-04T15:04:54.8939964Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8940013Z with policy():
2025-12-04T15:04:54.8940206Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8940251Z raise RuntimeError(msg)
2025-12-04T15:04:54.8940605Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872.
2025-12-04T15:04:54.8940609Z
2025-12-04T15:04:54.8940683Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8940932Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda
2025-12-04T15:04:54.8940934Z
2025-12-04T15:04:54.8941027Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8941094Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T15:04:54.8941162Z ====================== 1 failed, 20 deselected in 11.28s =======================
2025-12-04T15:04:54.8941201Z Got exit code 1
2025-12-04T15:04:54.8941245Z Retrying single test...
2025-12-04T15:04:54.8941433Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-92797088a8235f05.xml
2025-12-04T15:04:54.8941493Z ============================= test session starts ==============================
2025-12-04T15:04:54.8941605Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.8941651Z cachedir: .pytest_cache
2025-12-04T15:04:54.8941808Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.8941857Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.8941899Z configfile: pytest.ini
2025-12-04T15:04:54.8942063Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.8942138Z collecting ... collected 60 items / 26 deselected / 34 selected
2025-12-04T15:04:54.8942367Z stepcurrent: skipping 20 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda
2025-12-04T15:04:54.8942410Z Running 1 items in this shard
2025-12-04T15:04:54.8942412Z
2025-12-04T15:04:54.8942725Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda I1204 15:00:53.103000 452418 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 452487
2025-12-04T15:04:54.8942878Z I1204 15:00:53.103000 452418 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 452488
2025-12-04T15:04:54.8943034Z I1204 15:00:53.104000 452418 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 452489
2025-12-04T15:04:54.8943185Z I1204 15:00:53.104000 452418 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 452490
2025-12-04T15:04:54.8943574Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.8943628Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.8943978Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.8944029Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.8944391Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.8944442Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.8944790Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.8944841Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.8945423Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.8945468Z _warn_cpu_init()
2025-12-04T15:04:54.8946040Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.8946078Z _warn_cpu_init()
2025-12-04T15:04:54.8946640Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.8946679Z _warn_cpu_init()
2025-12-04T15:04:54.8947244Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.8947284Z _warn_cpu_init()
2025-12-04T15:04:54.8947574Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T15:04:54.8947621Z return func(*args, **kwargs)
2025-12-04T15:04:54.8947783Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8947962Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8948255Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8948415Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8948712Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8948846Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8949127Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8949276Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8949568Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8949716Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8949995Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8950134Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8950440Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8950588Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8951070Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968.
2025-12-04T15:04:54.8951189Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8951383Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8951745Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda
2025-12-04T15:04:54.8951859Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8952075Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8952274Z [rank0]:E1204 15:01:02.313000 452487 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T15:04:54.8952317Z dist init r=0, world=4
2025-12-04T15:04:54.8952457Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8952625Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8952931Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8953085Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8953373Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8953499Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8953793Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8953943Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8954222Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8954373Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8954650Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8954790Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8955068Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8955218Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8955693Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 261632 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224.
2025-12-04T15:04:54.8955813Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8956007Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8956370Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda
2025-12-04T15:04:54.8956510Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8956720Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8956888Z [rank3]:E1204 15:01:02.322000 452490 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T15:04:54.8956928Z dist init r=3, world=4
2025-12-04T15:04:54.8957068Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8957240Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8957530Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8957685Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8957983Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8958110Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8958385Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8958536Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8958811Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8958961Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8959235Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8959376Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8959654Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8959805Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8960319Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088.
2025-12-04T15:04:54.8960434Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8960631Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8961013Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda
2025-12-04T15:04:54.8961131Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8961342Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8961522Z [rank1]:E1204 15:01:02.326000 452488 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T15:04:54.8961565Z dist init r=1, world=4
2025-12-04T15:04:54.8961703Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.8961868Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.8962153Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8962329Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.8962614Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8962740Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.8963016Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8963169Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8963450Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8963601Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.8963881Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8964018Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.8964297Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8964447Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.8964924Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 261632 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872.
2025-12-04T15:04:54.8965060Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8965253Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.8965617Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda
2025-12-04T15:04:54.8965767Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.8965982Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.8966145Z [rank2]:E1204 15:01:02.380000 452489 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T15:04:54.8966189Z dist init r=2, world=4
2025-12-04T15:04:54.8966522Z [rank0]:[W1204 15:01:02.012573701 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T15:04:54.8966579Z FAILED [11.1213s] [100%]
2025-12-04T15:04:54.8966581Z
2025-12-04T15:04:54.8966638Z =================================== FAILURES ===================================
2025-12-04T15:04:54.8966744Z ___ TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda ____
2025-12-04T15:04:54.8966790Z Traceback (most recent call last):
2025-12-04T15:04:54.8966955Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.8967002Z self._join_processes(fn)
2025-12-04T15:04:54.8967178Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.8967234Z self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.8967411Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.8967458Z raise RuntimeError(error)
2025-12-04T15:04:54.8967538Z RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T15:04:54.8967587Z Traceback (most recent call last):
2025-12-04T15:04:54.8967747Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.8967795Z getattr(self, test_name)()
2025-12-04T15:04:54.8967954Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.8967994Z fn()
2025-12-04T15:04:54.8968145Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8968189Z method(*args, **kwargs)
2025-12-04T15:04:54.8968337Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.8968380Z method(*args, **kwargs)
2025-12-04T15:04:54.8968529Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.8968575Z with policy():
2025-12-04T15:04:54.8968727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.8968777Z raise RuntimeError(msg)
2025-12-04T15:04:54.8969127Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968.
2025-12-04T15:04:54.8969156Z 2025-12-04T15:04:54.8969239Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8969471Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.8969474Z 2025-12-04T15:04:54.8969569Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8969571Z 2025-12-04T15:04:54.8969573Z 2025-12-04T15:04:54.8969669Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.8969758Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.8969994Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-92797088a8235f05.xml - 2025-12-04T15:04:54.8970057Z =========================== short test summary info ============================ 2025-12-04T15:04:54.8970339Z FAILED [11.1213s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8970386Z Traceback (most recent call last): 2025-12-04T15:04:54.8970567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8970611Z getattr(self, test_name)() 2025-12-04T15:04:54.8970773Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8970808Z fn() 2025-12-04T15:04:54.8970959Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8971004Z method(*args, **kwargs) 2025-12-04T15:04:54.8971157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8971198Z method(*args, **kwargs) 2025-12-04T15:04:54.8971349Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8971386Z with policy(): 2025-12-04T15:04:54.8971539Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8971580Z raise RuntimeError(msg) 2025-12-04T15:04:54.8971931Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T15:04:54.8971935Z 2025-12-04T15:04:54.8972009Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8972239Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.8972241Z 2025-12-04T15:04:54.8972330Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8972394Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
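The leak checker that raises in common_utils.py's __exit__ above works by snapshotting per-device memory counters before the test body and comparing them afterwards: the "Caching allocator allocated memory" figures come from the CUDA caching allocator, the "CUDA driver allocated memory" figures from the driver. A minimal sketch of that comparison using only public torch.cuda calls (the helper name run_with_leak_check is hypothetical, and the in-tree check is considerably more involved):

import torch

def run_with_leak_check(fn, device=0):
    # Snapshot allocator and driver counters before the test body.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)
    free_before, total = torch.cuda.mem_get_info(device)
    driver_before = total - free_before  # the log's "CUDA driver allocated memory"

    fn()

    # Snapshot again after the test body and compare.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after
    if alloc_after > alloc_before or driver_after > driver_before:
        raise RuntimeError(
            f"possible leak: allocator {alloc_before} -> {alloc_after}, "
            f"driver {driver_before} -> {driver_after}"
        )

if __name__ == "__main__":
    if torch.cuda.is_available():  # also true on ROCm builds, where torch.cuda is HIP-backed
        run_with_leak_check(lambda: torch.ones(1024, device="cuda").sum().item())

On ROCm the same torch.cuda entry points are backed by HIP, which is why the repro command printed above drives this check with just PYTORCH_TEST_WITH_ROCM=1 and PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1.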
2025-12-04T15:04:54.8972459Z ====================== 1 failed, 26 deselected in 11.28s ======================= 2025-12-04T15:04:54.8972496Z Got exit code 1 2025-12-04T15:04:54.8972540Z Retrying single test... 2025-12-04T15:04:54.8972727Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2eb1f1c6e40c45fe.xml 2025-12-04T15:04:54.8972803Z ============================= test session starts ============================== 2025-12-04T15:04:54.8972929Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.8972978Z cachedir: .pytest_cache 2025-12-04T15:04:54.8973135Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.8973188Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.8973233Z configfile: pytest.ini 2025-12-04T15:04:54.8973401Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.8973492Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.8973723Z stepcurrent: skipping 20 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.8973771Z Running 1 items in this shard 2025-12-04T15:04:54.8973774Z 2025-12-04T15:04:54.8974086Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda I1204 15:01:06.734000 452820 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 452889 2025-12-04T15:04:54.8974243Z I1204 15:01:06.734000 452820 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 452890 2025-12-04T15:04:54.8974413Z I1204 15:01:06.735000 452820 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 452891 2025-12-04T15:04:54.8974570Z I1204 15:01:06.735000 452820 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 452892 2025-12-04T15:04:54.8974929Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8974988Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8975338Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8975393Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8975746Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8975799Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8976148Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because 
encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.8976199Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.8976775Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8976814Z _warn_cpu_init() 2025-12-04T15:04:54.8977381Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8977442Z _warn_cpu_init() 2025-12-04T15:04:54.8978025Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8978065Z _warn_cpu_init() 2025-12-04T15:04:54.8978629Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.8978670Z _warn_cpu_init() 2025-12-04T15:04:54.8978976Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T15:04:54.8979023Z return func(*args, **kwargs) 2025-12-04T15:04:54.8979167Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8979335Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8979626Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8979787Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8980075Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8980231Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8980510Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8980660Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8980945Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8981092Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8981370Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8981507Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8981819Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8981969Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8982464Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 
2025-12-04T15:04:54.8982586Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8982782Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8983144Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.8983273Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8983488Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8983653Z [rank2]:E1204 15:01:16.064000 452891 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.8983696Z dist init r=2, world=4 2025-12-04T15:04:54.8983837Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8983995Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8984283Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8984436Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8984722Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8984848Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8985127Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8985275Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8985554Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8985704Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8985989Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8986138Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.8986414Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8986564Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8987052Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T15:04:54.8987172Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8987370Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8987741Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.8987857Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8988066Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8988235Z [rank0]:E1204 15:01:16.076000 452889 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.8988274Z dist init r=0, world=4 2025-12-04T15:04:54.8988414Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8988573Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8988862Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8989016Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.8989303Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8989430Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8989711Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8989862Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.8990137Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8990334Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8990608Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8990747Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8991048Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8991196Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8991676Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 2025-12-04T15:04:54.8991823Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8992021Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8992378Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.8992495Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8992709Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8992873Z [rank1]:E1204 15:01:16.083000 452890 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.8992914Z dist init r=1, world=4 2025-12-04T15:04:54.8993052Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.8993213Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.8993500Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8993659Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T15:04:54.8993944Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8994070Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.8994348Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8994516Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8994818Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8994965Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.8995255Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8995394Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.8995674Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.8995825Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.8996319Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 
2025-12-04T15:04:54.8996435Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8996631Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.8996991Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.8997104Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.8997318Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.8997484Z [rank3]:E1204 15:01:16.127000 452892 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.8997525Z dist init r=3, world=4 2025-12-04T15:04:54.8997861Z [rank0]:[W1204 15:01:16.786899423 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.8997906Z FAILED [11.3220s] [100%] 2025-12-04T15:04:54.8997908Z 2025-12-04T15:04:54.8997967Z =================================== FAILURES =================================== 2025-12-04T15:04:54.8998070Z ___ TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda ____ 2025-12-04T15:04:54.8998119Z Traceback (most recent call last): 2025-12-04T15:04:54.8998283Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.8998332Z self._join_processes(fn) 2025-12-04T15:04:54.8998505Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.8998580Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.8998770Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.8998816Z raise RuntimeError(error) 2025-12-04T15:04:54.8998898Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.8998947Z Traceback (most recent call last): 2025-12-04T15:04:54.8999109Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.8999154Z getattr(self, test_name)() 2025-12-04T15:04:54.8999327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.8999366Z fn() 2025-12-04T15:04:54.8999517Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8999563Z method(*args, **kwargs) 2025-12-04T15:04:54.8999713Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.8999758Z method(*args, **kwargs) 2025-12-04T15:04:54.8999907Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.8999948Z with policy(): 2025-12-04T15:04:54.9000111Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9000156Z raise RuntimeError(msg) 2025-12-04T15:04:54.9000551Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T15:04:54.9000554Z 2025-12-04T15:04:54.9000632Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9000864Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.9000868Z 2025-12-04T15:04:54.9000957Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9000959Z 2025-12-04T15:04:54.9000960Z 2025-12-04T15:04:54.9001040Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.9001128Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.9001365Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2eb1f1c6e40c45fe.xml - 2025-12-04T15:04:54.9001426Z =========================== short test summary info ============================ 2025-12-04T15:04:54.9001680Z FAILED [11.3220s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9001729Z Traceback (most recent call last): 2025-12-04T15:04:54.9001896Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9001939Z getattr(self, test_name)() 2025-12-04T15:04:54.9002102Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9002137Z fn() 2025-12-04T15:04:54.9002291Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9002333Z method(*args, **kwargs) 2025-12-04T15:04:54.9002487Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9002557Z method(*args, **kwargs) 2025-12-04T15:04:54.9002710Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9002748Z with policy(): 2025-12-04T15:04:54.9002904Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9002945Z raise RuntimeError(msg) 2025-12-04T15:04:54.9003316Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 
2025-12-04T15:04:54.9003319Z 2025-12-04T15:04:54.9003396Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9003626Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.9003630Z 2025-12-04T15:04:54.9003720Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9003783Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.9003849Z ====================== 1 failed, 26 deselected in 11.48s ======================= 2025-12-04T15:04:54.9003886Z Got exit code 1 2025-12-04T15:04:54.9004082Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T15:04:54.9004211Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.9004401Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9a80c76a27456efe.xml 2025-12-04T15:04:54.9004460Z ============================= test session starts ============================== 2025-12-04T15:04:54.9004577Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.9004619Z cachedir: .pytest_cache 2025-12-04T15:04:54.9004779Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.9004827Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.9004873Z configfile: pytest.ini 2025-12-04T15:04:54.9005035Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.9005116Z collecting ... collected 60 items / 21 deselected / 39 selected 2025-12-04T15:04:54.9005169Z stepcurrent: skipping 21 already run items. 
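The ProcessGroupNCCL warning that follows each run above ("destroy_process_group() was not called before program exit") asks for an explicit teardown of the default process group. A minimal, runnable sketch of that pairing, using a single-process gloo group so it needs no GPUs (a real job would use backend="nccl", and recent torch versions also accept a device_id argument to init_process_group, which silences the barrier() device warning seen earlier):

import os
import torch.distributed as dist

def main():
    # Rendezvous config for a single-process group; real jobs get these
    # values from the launcher.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(backend="gloo", rank=0, world_size=1)
    try:
        dist.barrier()  # stand-in for the actual collectives
    finally:
        dist.destroy_process_group()  # the explicit teardown the warning asks for

if __name__ == "__main__":
    main()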
2025-12-04T15:04:54.9005217Z Running 6 items in this shard 2025-12-04T15:04:54.9005219Z 2025-12-04T15:04:54.9005526Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_no_shard_cuda I1204 15:01:20.654000 453222 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 453291 2025-12-04T15:04:54.9005684Z I1204 15:01:20.655000 453222 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 453292 2025-12-04T15:04:54.9005838Z I1204 15:01:20.655000 453222 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 453293 2025-12-04T15:04:54.9005989Z I1204 15:01:20.656000 453222 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 453294 2025-12-04T15:04:54.9006351Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9006414Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9006771Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9006829Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9007184Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9007229Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9007595Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9007645Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9007933Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9007981Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.9008271Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9008318Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.9008901Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.9008944Z _warn_cpu_init() 2025-12-04T15:04:54.9009514Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.9009555Z _warn_cpu_init() 2025-12-04T15:04:54.9009847Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9009888Z fsdp_model = FSDP( 2025-12-04T15:04:54.9010223Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9010262Z fsdp_model = FSDP( 2025-12-04T15:04:54.9010557Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.9010602Z return func(*args, **kwargs) 2025-12-04T15:04:54.9010885Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9010929Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.9011526Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.9011583Z _warn_cpu_init() 2025-12-04T15:04:54.9011862Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9011922Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.9012500Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.9012544Z _warn_cpu_init() 2025-12-04T15:04:54.9012846Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T15:04:54.9012891Z fsdp_model = FSDP( 2025-12-04T15:04:54.9013176Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9013220Z fsdp_model = FSDP( 2025-12-04T15:04:54.9013453Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9013501Z return func(*args, **kwargs) 2025-12-04T15:04:54.9013726Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9013772Z return func(*args, **kwargs) 2025-12-04T15:04:54.9013998Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9014040Z return func(*args, **kwargs) 2025-12-04T15:04:54.9014265Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9014307Z return func(*args, **kwargs) 2025-12-04T15:04:54.9014530Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9014573Z return func(*args, **kwargs) 2025-12-04T15:04:54.9014798Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9014840Z return func(*args, **kwargs) 2025-12-04T15:04:54.9015062Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9015104Z return func(*args, **kwargs) 2025-12-04T15:04:54.9015326Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T15:04:54.9015379Z return func(*args, **kwargs) 2025-12-04T15:04:54.9015543Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9015711Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9016024Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9016202Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9016502Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9016638Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9016935Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9017103Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9017389Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9017544Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9017832Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9017977Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9018271Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9018426Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9018925Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 2. CUDA driver allocated memory was 2300575744 and is now 3904897024. 
2025-12-04T15:04:54.9019048Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9019256Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9019622Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9019745Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9019982Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9020166Z [rank2]:E1204 15:01:30.167000 453293 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9020242Z dist init r=2, world=4 2025-12-04T15:04:54.9020388Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9020560Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9020876Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9021042Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9021339Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9021470Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9021774Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9021931Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9022221Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9022375Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9022666Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9022808Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.9023102Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9023260Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9023756Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 1. CUDA driver allocated memory was 2317352960 and is now 3921674240. 2025-12-04T15:04:54.9023880Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9024086Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9024453Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9024603Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9024826Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9024997Z [rank1]:E1204 15:01:30.168000 453292 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.9025041Z dist init r=1, world=4 2025-12-04T15:04:54.9025197Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9025367Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9025669Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9025837Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9026167Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9026303Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9026606Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9026772Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T15:04:54.9027079Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9027237Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9027538Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9027696Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9028001Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9028167Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9028679Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1124864 on device 0. CUDA driver allocated memory was 2453667840 and is now 4057989120. 2025-12-04T15:04:54.9028805Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9029017Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9037159Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9037301Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9037540Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9037764Z [rank0]:E1204 15:01:30.210000 453291 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.9037815Z dist init r=0, world=4 2025-12-04T15:04:54.9037977Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9038158Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9038476Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9038670Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 
2025-12-04T15:04:54.9038989Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9039134Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9039442Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9039605Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9039915Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9040079Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9040416Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9040571Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9040887Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9041053Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9041577Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1124864 on device 3. CUDA driver allocated memory was 2250244096 and is now 3854565376. 
2025-12-04T15:04:54.9041745Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9041959Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9042345Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9042483Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9042720Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9042903Z [rank3]:E1204 15:01:30.229000 453294 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.9042952Z dist init r=3, world=4 2025-12-04T15:04:54.9043343Z [rank0]:[W1204 15:01:30.991772112 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.9043390Z FAILED [11.5229s] [ 16%] 2025-12-04T15:04:54.9043393Z 2025-12-04T15:04:54.9043460Z =================================== FAILURES =================================== 2025-12-04T15:04:54.9043570Z ______ TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda _______ 2025-12-04T15:04:54.9043626Z Traceback (most recent call last): 2025-12-04T15:04:54.9043808Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.9043863Z self._join_processes(fn) 2025-12-04T15:04:54.9044053Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.9044118Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.9044313Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.9044366Z raise RuntimeError(error) 2025-12-04T15:04:54.9044454Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.9044508Z Traceback (most recent call last): 2025-12-04T15:04:54.9044685Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9044736Z getattr(self, test_name)() 2025-12-04T15:04:54.9044910Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9044955Z fn() 2025-12-04T15:04:54.9045120Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9045169Z method(*args, **kwargs) 2025-12-04T15:04:54.9045333Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9045384Z method(*args, **kwargs) 2025-12-04T15:04:54.9045547Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9045593Z with policy(): 2025-12-04T15:04:54.9045763Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9045812Z raise RuntimeError(msg) 2025-12-04T15:04:54.9046206Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 1. CUDA driver allocated memory was 2317352960 and is now 3921674240. 2025-12-04T15:04:54.9046226Z 2025-12-04T15:04:54.9046311Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9046559Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9046561Z 2025-12-04T15:04:54.9046661Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9046663Z 2025-12-04T15:04:54.9046750Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.9046802Z Traceback (most recent call last): 2025-12-04T15:04:54.9046982Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9047031Z getattr(self, test_name)() 2025-12-04T15:04:54.9047206Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9047246Z fn() 2025-12-04T15:04:54.9047414Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9047458Z method(*args, **kwargs) 2025-12-04T15:04:54.9047639Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9047684Z method(*args, **kwargs) 2025-12-04T15:04:54.9047852Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9047893Z with policy(): 2025-12-04T15:04:54.9048060Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9048108Z raise RuntimeError(msg) 2025-12-04T15:04:54.9048485Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 2. CUDA driver allocated memory was 2300575744 and is now 3904897024. 
2025-12-04T15:04:54.9048488Z 2025-12-04T15:04:54.9048570Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9048812Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9048815Z 2025-12-04T15:04:54.9048914Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9048916Z 2025-12-04T15:04:54.9048919Z 2025-12-04T15:04:54.9049006Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.9049107Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.9049361Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9a80c76a27456efe.xml - 2025-12-04T15:04:54.9049436Z =========================== short test summary info ============================ 2025-12-04T15:04:54.9049704Z FAILED [11.5229s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.9049758Z Traceback (most recent call last): 2025-12-04T15:04:54.9049937Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9049988Z getattr(self, test_name)() 2025-12-04T15:04:54.9050233Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9050292Z fn() 2025-12-04T15:04:54.9050456Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9050504Z method(*args, **kwargs) 2025-12-04T15:04:54.9050667Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9050715Z method(*args, **kwargs) 2025-12-04T15:04:54.9050881Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9050926Z with policy(): 2025-12-04T15:04:54.9051116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9051169Z raise RuntimeError(msg) 2025-12-04T15:04:54.9051548Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 1. CUDA driver allocated memory was 2317352960 and is now 3921674240. 
2025-12-04T15:04:54.9051552Z 2025-12-04T15:04:54.9051632Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9051888Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9051891Z 2025-12-04T15:04:54.9051987Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9051989Z 2025-12-04T15:04:54.9052057Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.9052107Z Traceback (most recent call last): 2025-12-04T15:04:54.9052286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9052335Z getattr(self, test_name)() 2025-12-04T15:04:54.9052510Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9052549Z fn() 2025-12-04T15:04:54.9052715Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9052759Z method(*args, **kwargs) 2025-12-04T15:04:54.9052925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9052969Z method(*args, **kwargs) 2025-12-04T15:04:54.9053135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9053175Z with policy(): 2025-12-04T15:04:54.9053343Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9053391Z raise RuntimeError(msg) 2025-12-04T15:04:54.9053769Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 2. CUDA driver allocated memory was 2300575744 and is now 3904897024. 2025-12-04T15:04:54.9053771Z 2025-12-04T15:04:54.9053856Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9054094Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9054096Z 2025-12-04T15:04:54.9054196Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9054265Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.9054357Z ====================== 1 failed, 21 deselected in 11.67s ======================= 2025-12-04T15:04:54.9054414Z Got exit code 1 2025-12-04T15:04:54.9054462Z Retrying single test... 
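Besides the leak itself, two warnings recur on every retry: barrier() picking the device from the current context, and ProcessGroupNCCL complaining that destroy_process_group() was never called before program exit. Both are addressed by binding the process group to a device at init time and tearing it down explicitly. A minimal sketch, assuming a torchrun-style launch (LOCAL_RANK set by the launcher) and PyTorch 2.2 or newer for the device_id argument:

import os
import torch
import torch.distributed as dist

def main():
    local_rank = int(os.environ["LOCAL_RANK"])  # provided by torchrun
    torch.cuda.set_device(local_rank)
    # Binding the group to an explicit device silences the
    # "barrier(): using the device under current context" UserWarning.
    dist.init_process_group(
        backend="nccl",
        device_id=torch.device(f"cuda:{local_rank}"),
    )
    try:
        dist.barrier()
        # ... test or training body ...
    finally:
        # Explicit teardown avoids the ProcessGroupNCCL warning about
        # destroy_process_group() not being called before program exit.
        dist.destroy_process_group()

if __name__ == "__main__":
    main()

Launched as, for example: torchrun --nproc-per-node=4 script.py (on ROCm the "nccl" backend is backed by RCCL).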
2025-12-04T15:04:54.9054667Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-590bc1934aa2a940.xml 2025-12-04T15:04:54.9054735Z ============================= test session starts ============================== 2025-12-04T15:04:54.9054860Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.9054909Z cachedir: .pytest_cache 2025-12-04T15:04:54.9055081Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.9055146Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.9055192Z configfile: pytest.ini 2025-12-04T15:04:54.9055373Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.9055457Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.9055693Z stepcurrent: skipping 21 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9055743Z Running 1 items in this shard 2025-12-04T15:04:54.9055745Z 2025-12-04T15:04:54.9056090Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_no_shard_cuda I1204 15:01:34.821000 453624 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 453693 2025-12-04T15:04:54.9056262Z I1204 15:01:34.821000 453624 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 453694 2025-12-04T15:04:54.9056432Z I1204 15:01:34.822000 453624 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 453695 2025-12-04T15:04:54.9056600Z I1204 15:01:34.822000 453624 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 453696 2025-12-04T15:04:54.9056991Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9057049Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9057435Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9057490Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9057870Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9057926Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9058307Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9058362Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9058676Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9058725Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.9059027Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9059101Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.9059729Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.9059771Z _warn_cpu_init() 2025-12-04T15:04:54.9060427Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.9060475Z _warn_cpu_init() 2025-12-04T15:04:54.9060773Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9060839Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.9061137Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9061188Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.9061803Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.9061849Z _warn_cpu_init() 2025-12-04T15:04:54.9062460Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.9062505Z _warn_cpu_init() 2025-12-04T15:04:54.9062819Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9062863Z fsdp_model = FSDP( 2025-12-04T15:04:54.9063172Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9063217Z fsdp_model = FSDP( 2025-12-04T15:04:54.9063526Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9063570Z fsdp_model = FSDP( 2025-12-04T15:04:54.9063881Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9063951Z fsdp_model = FSDP( 2025-12-04T15:04:54.9064272Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.9064319Z return func(*args, **kwargs) 2025-12-04T15:04:54.9064570Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9064616Z return func(*args, **kwargs) 2025-12-04T15:04:54.9064871Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9064918Z return func(*args, **kwargs) 2025-12-04T15:04:54.9065160Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9065210Z return func(*args, **kwargs) 2025-12-04T15:04:54.9065448Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9065498Z return func(*args, **kwargs) 2025-12-04T15:04:54.9065746Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9065794Z return func(*args, **kwargs) 2025-12-04T15:04:54.9066030Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9066080Z return func(*args, **kwargs) 2025-12-04T15:04:54.9066315Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T15:04:54.9066366Z return func(*args, **kwargs) 2025-12-04T15:04:54.9066604Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9066653Z return func(*args, **kwargs) 2025-12-04T15:04:54.9066811Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9066993Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9067304Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9067476Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9067789Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9067926Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9068235Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9068396Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9068728Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9068887Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9069187Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9069348Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9069653Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9069818Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9070379Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 2. CUDA driver allocated memory was 2300575744 and is now 3904897024.
2025-12-04T15:04:54.9070510Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9070722Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9071102Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9071227Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9071459Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9071641Z [rank2]:E1204 15:01:44.418000 453695 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9071684Z dist init r=2, world=4 2025-12-04T15:04:54.9072015Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9072188Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9072502Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9072671Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9072983Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9073118Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9073439Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9073619Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9073923Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9074086Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9074401Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9074554Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.9074853Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9075027Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9075543Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 1. CUDA driver allocated memory was 2317352960 and is now 3921674240. 2025-12-04T15:04:54.9075670Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9075885Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9076260Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9076385Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9076613Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9076792Z [rank1]:E1204 15:01:44.423000 453694 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.9076837Z dist init r=1, world=4 2025-12-04T15:04:54.9076991Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9077163Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9077476Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9077647Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9077957Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9078123Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9078419Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9078581Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T15:04:54.9078890Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9079053Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9079352Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9079501Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9079813Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9079973Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9080542Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 0. CUDA driver allocated memory was 2453667840 and is now 4057989120. 2025-12-04T15:04:54.9080668Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9080882Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9081256Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9081381Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9081611Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9081788Z [rank0]:E1204 15:01:44.490000 453693 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.9081835Z dist init r=0, world=4 2025-12-04T15:04:54.9081984Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9082158Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9082468Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9082654Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 
2025-12-04T15:04:54.9082978Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9083116Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9083426Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9083589Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9083888Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9084047Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9084361Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9084508Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9084808Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9084967Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9085478Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1158656 on device 3. CUDA driver allocated memory was 2250244096 and is now 3854565376. 
2025-12-04T15:04:54.9085603Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9085813Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9086188Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9086311Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9086542Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9086718Z [rank3]:E1204 15:01:44.492000 453696 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.9086765Z dist init r=3, world=4 2025-12-04T15:04:54.9087124Z [rank0]:[W1204 15:01:44.434753354 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.9087205Z FAILED [11.6217s] [100%] 2025-12-04T15:04:54.9087208Z 2025-12-04T15:04:54.9087273Z =================================== FAILURES =================================== 2025-12-04T15:04:54.9087380Z ______ TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda _______ 2025-12-04T15:04:54.9087434Z Traceback (most recent call last): 2025-12-04T15:04:54.9087610Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.9087662Z self._join_processes(fn) 2025-12-04T15:04:54.9087851Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.9087926Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.9088117Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.9088170Z raise RuntimeError(error) 2025-12-04T15:04:54.9088257Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.9088309Z Traceback (most recent call last): 2025-12-04T15:04:54.9088482Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9088531Z getattr(self, test_name)() 2025-12-04T15:04:54.9088712Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9088758Z fn() 2025-12-04T15:04:54.9088921Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9088969Z method(*args, **kwargs) 2025-12-04T15:04:54.9089133Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9089181Z method(*args, **kwargs) 2025-12-04T15:04:54.9089343Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9089386Z with policy(): 2025-12-04T15:04:54.9089549Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9089597Z raise RuntimeError(msg) 2025-12-04T15:04:54.9089967Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 1. CUDA driver allocated memory was 2317352960 and is now 3921674240. 2025-12-04T15:04:54.9089971Z 2025-12-04T15:04:54.9090057Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9090352Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9090360Z 2025-12-04T15:04:54.9090457Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9090459Z 2025-12-04T15:04:54.9090461Z 2025-12-04T15:04:54.9090548Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.9090643Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.9090894Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-590bc1934aa2a940.xml - 2025-12-04T15:04:54.9090960Z =========================== short test summary info ============================ 2025-12-04T15:04:54.9091224Z FAILED [11.6217s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.9091291Z Traceback (most recent call last): 2025-12-04T15:04:54.9091493Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9091542Z getattr(self, test_name)() 2025-12-04T15:04:54.9091717Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9091756Z fn() 2025-12-04T15:04:54.9091927Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9091972Z method(*args, **kwargs) 2025-12-04T15:04:54.9092154Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9092199Z method(*args, **kwargs) 2025-12-04T15:04:54.9092363Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9092409Z with policy(): 2025-12-04T15:04:54.9092576Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9092621Z raise RuntimeError(msg) 2025-12-04T15:04:54.9093012Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 1. CUDA driver allocated memory was 2317352960 and is now 3921674240. 
2025-12-04T15:04:54.9093015Z 2025-12-04T15:04:54.9093099Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9093335Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9093337Z 2025-12-04T15:04:54.9093436Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9093505Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.9093579Z ====================== 1 failed, 26 deselected in 11.78s ======================= 2025-12-04T15:04:54.9093620Z Got exit code 1 2025-12-04T15:04:54.9093668Z Retrying single test... 2025-12-04T15:04:54.9093873Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d691a3c633764b61.xml 2025-12-04T15:04:54.9093938Z ============================= test session starts ============================== 2025-12-04T15:04:54.9094062Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.9094108Z cachedir: .pytest_cache 2025-12-04T15:04:54.9094280Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.9094332Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.9094377Z configfile: pytest.ini 2025-12-04T15:04:54.9094550Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.9094633Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.9094864Z stepcurrent: skipping 21 already run items. 
Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9094915Z Running 1 items in this shard 2025-12-04T15:04:54.9094917Z 2025-12-04T15:04:54.9095240Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_no_shard_cuda I1204 15:01:49.116000 454026 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 454095 2025-12-04T15:04:54.9095404Z I1204 15:01:49.116000 454026 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 454096 2025-12-04T15:04:54.9095581Z I1204 15:01:49.117000 454026 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 454097 2025-12-04T15:04:54.9095755Z I1204 15:01:49.117000 454026 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 454098 2025-12-04T15:04:54.9096140Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9096191Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9096584Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9096634Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9097010Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9097058Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9097443Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9097493Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9097792Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9097841Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.9098453Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.9098495Z _warn_cpu_init() 2025-12-04T15:04:54.9098802Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T15:04:54.9098846Z fsdp_model = FSDP( 2025-12-04T15:04:54.9099155Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.9099204Z return func(*args, **kwargs) 2025-12-04T15:04:54.9099498Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9099545Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.9099839Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9099885Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.9100537Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.9100610Z _warn_cpu_init() 2025-12-04T15:04:54.9101231Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.9101274Z _warn_cpu_init() 2025-12-04T15:04:54.9101571Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9101622Z return fsdp_fn(module, **kwargs) 2025-12-04T15:04:54.9102261Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.9102304Z _warn_cpu_init() 2025-12-04T15:04:54.9102611Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9102657Z fsdp_model = FSDP( 2025-12-04T15:04:54.9102962Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T15:04:54.9103007Z fsdp_model = FSDP( 2025-12-04T15:04:54.9103313Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:395: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T15:04:54.9103357Z fsdp_model = FSDP( 2025-12-04T15:04:54.9103604Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9103651Z return func(*args, **kwargs) 2025-12-04T15:04:54.9103891Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9103937Z return func(*args, **kwargs) 2025-12-04T15:04:54.9104176Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9104220Z return func(*args, **kwargs) 2025-12-04T15:04:54.9104459Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9104502Z return func(*args, **kwargs) 2025-12-04T15:04:54.9104737Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9104781Z return func(*args, **kwargs) 2025-12-04T15:04:54.9105016Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9105084Z return func(*args, **kwargs) 2025-12-04T15:04:54.9105319Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T15:04:54.9105362Z return func(*args, **kwargs) 2025-12-04T15:04:54.9105596Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T15:04:54.9105638Z return func(*args, **kwargs) 2025-12-04T15:04:54.9105804Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9105980Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9106295Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9106464Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9106782Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9106918Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9107217Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9107378Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9107674Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9107835Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9108135Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9108281Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9108583Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9108740Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9109251Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 0. CUDA driver allocated memory was 2453667840 and is now 4057989120. 
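The FutureWarnings captured above say the `NO_SHARD` sharding strategy is deprecated and recommend `DistributedDataParallel` instead. A minimal single-process sketch of that substitution (the tiny Linear model, rendezvous address, and one-rank world are placeholders, not the test's actual transformer setup):

    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # placeholder rendezvous
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=0, world_size=1)  # "nccl" maps to RCCL on ROCm
    model = torch.nn.Linear(8, 8).cuda(0)                  # stand-in for the transformer
    # DDP replicates the unsharded model on each rank, which is what
    # FSDP's NO_SHARD strategy effectively did.
    ddp_model = DDP(model, device_ids=[0])
    loss = ddp_model(torch.randn(4, 8, device="cuda:0")).sum()
    loss.backward()                                        # DDP all-reduces gradients
    dist.destroy_process_group()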
2025-12-04T15:04:54.9109375Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9109600Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9109988Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9110114Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9110401Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9110577Z [rank0]:E1204 15:01:58.704000 454095 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.9110621Z dist init r=0, world=4 2025-12-04T15:04:54.9110773Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9110945Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9111268Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9111434Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9111741Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9111876Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9112177Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9112334Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9112634Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9112792Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9113091Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9113239Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.9113539Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9113697Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9114209Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 1. CUDA driver allocated memory was 2317352960 and is now 3921674240. 2025-12-04T15:04:54.9114364Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9114574Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9114956Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9115088Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9115319Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9115498Z [rank1]:E1204 15:01:58.728000 454096 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.9115540Z dist init r=1, world=4 2025-12-04T15:04:54.9115688Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9115873Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9116181Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9116344Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9116652Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9116786Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9117087Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9117245Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T15:04:54.9117540Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9117698Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9117993Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9118142Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9118438Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9118596Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9119122Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1158656 on device 2. CUDA driver allocated memory was 2300575744 and is now 3904897024. 2025-12-04T15:04:54.9119246Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9119456Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9119845Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9119969Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9120223Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9120413Z [rank2]:E1204 15:01:58.729000 454097 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9120455Z dist init r=2, world=4 2025-12-04T15:04:54.9120604Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9120774Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9121081Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9121247Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 
2025-12-04T15:04:54.9121553Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9121686Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9121982Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9122141Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9122437Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9122594Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9122891Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9123036Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9123331Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9123516Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9124021Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1091072 on device 3. CUDA driver allocated memory was 2250244096 and is now 3854565376. 
2025-12-04T15:04:54.9124156Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9124367Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9124745Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9124866Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9125104Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9125279Z [rank3]:E1204 15:01:58.745000 454098 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.9125322Z dist init r=3, world=4 2025-12-04T15:04:54.9125679Z [rank0]:[W1204 15:01:58.479985874 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.9125725Z FAILED [11.7224s] [100%] 2025-12-04T15:04:54.9125727Z 2025-12-04T15:04:54.9125787Z =================================== FAILURES =================================== 2025-12-04T15:04:54.9125892Z ______ TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda _______ 2025-12-04T15:04:54.9125941Z Traceback (most recent call last): 2025-12-04T15:04:54.9126116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.9126162Z self._join_processes(fn) 2025-12-04T15:04:54.9126347Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.9126404Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.9126596Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.9126643Z raise RuntimeError(error) 2025-12-04T15:04:54.9126731Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.9126778Z Traceback (most recent call last): 2025-12-04T15:04:54.9126953Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9126997Z getattr(self, test_name)() 2025-12-04T15:04:54.9127166Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9127204Z fn() 2025-12-04T15:04:54.9127368Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9127425Z method(*args, **kwargs) 2025-12-04T15:04:54.9127586Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9127642Z method(*args, **kwargs) 2025-12-04T15:04:54.9127801Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9127841Z with policy(): 2025-12-04T15:04:54.9128005Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9128049Z raise RuntimeError(msg) 2025-12-04T15:04:54.9128440Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1158656 on device 2. CUDA driver allocated memory was 2300575744 and is now 3904897024. 2025-12-04T15:04:54.9128443Z 2025-12-04T15:04:54.9128528Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9128765Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9128767Z 2025-12-04T15:04:54.9128862Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9128864Z 2025-12-04T15:04:54.9128866Z 2025-12-04T15:04:54.9128945Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.9129051Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.9129302Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d691a3c633764b61.xml - 2025-12-04T15:04:54.9129370Z =========================== short test summary info ============================ 2025-12-04T15:04:54.9129628Z FAILED [11.7224s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_no_shard_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.9129682Z Traceback (most recent call last): 2025-12-04T15:04:54.9129858Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9129907Z getattr(self, test_name)() 2025-12-04T15:04:54.9130077Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9130116Z fn() 2025-12-04T15:04:54.9130322Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9130367Z method(*args, **kwargs) 2025-12-04T15:04:54.9130535Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9130580Z method(*args, **kwargs) 2025-12-04T15:04:54.9130742Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9130784Z with policy(): 2025-12-04T15:04:54.9130947Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9130990Z raise RuntimeError(msg) 2025-12-04T15:04:54.9131362Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 1158656 on device 2. CUDA driver allocated memory was 2300575744 and is now 3904897024. 
2025-12-04T15:04:54.9131365Z 2025-12-04T15:04:54.9131447Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9131685Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9131716Z 2025-12-04T15:04:54.9131809Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9131877Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.9131944Z ====================== 1 failed, 26 deselected in 11.89s ======================= 2025-12-04T15:04:54.9131986Z Got exit code 1 2025-12-04T15:04:54.9132169Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_no_shard_cuda 2025-12-04T15:04:54.9132305Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.9132520Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6ba3c00338b4ee7a.xml 2025-12-04T15:04:54.9132585Z ============================= test session starts ============================== 2025-12-04T15:04:54.9132706Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.9132756Z cachedir: .pytest_cache 2025-12-04T15:04:54.9132923Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.9132972Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.9133015Z configfile: pytest.ini 2025-12-04T15:04:54.9133204Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.9133284Z collecting ... collected 60 items / 22 deselected / 38 selected 2025-12-04T15:04:54.9133343Z stepcurrent: skipping 22 already run items. 
2025-12-04T15:04:54.9133390Z Running 5 items in this shard 2025-12-04T15:04:54.9133393Z 2025-12-04T15:04:54.9133707Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda I1204 15:02:03.444000 454428 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 454497 2025-12-04T15:04:54.9133876Z I1204 15:02:03.445000 454428 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 454498 2025-12-04T15:04:54.9134037Z I1204 15:02:03.446000 454428 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 454499 2025-12-04T15:04:54.9134202Z I1204 15:02:03.446000 454428 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 454500 2025-12-04T15:04:54.9134588Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9134640Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9135016Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9135068Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9135443Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9135493Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9135869Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9135928Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9136542Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.9136607Z _warn_cpu_init() 2025-12-04T15:04:54.9137225Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T15:04:54.9137267Z _warn_cpu_init() 2025-12-04T15:04:54.9137883Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.9137928Z _warn_cpu_init() 2025-12-04T15:04:54.9138239Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.9138285Z return func(*args, **kwargs) 2025-12-04T15:04:54.9138890Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T15:04:54.9138932Z _warn_cpu_init() 2025-12-04T15:04:54.9139085Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9139259Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9139569Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9139736Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9140040Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9140212Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9140513Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9140671Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9140990Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9141168Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9141465Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9141626Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9141923Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9142083Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9142598Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 0. CUDA driver allocated memory was 2453667840 and is now 4032823296. 2025-12-04T15:04:54.9142724Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9142935Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9143304Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda 2025-12-04T15:04:54.9143430Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9143656Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9143832Z [rank0]:E1204 15:02:13.503000 454497 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.9143874Z dist init r=0, world=4 2025-12-04T15:04:54.9144023Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9144193Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9144501Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9144664Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9144970Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9145104Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9145401Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9145589Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9145888Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9146046Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9146354Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9146504Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9146803Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9146963Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9147474Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 3879731200. 
2025-12-04T15:04:54.9147597Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9147808Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9148172Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda 2025-12-04T15:04:54.9148296Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9148527Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9148702Z [rank2]:E1204 15:02:13.549000 454499 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9148744Z dist init r=2, world=4 2025-12-04T15:04:54.9148893Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9149065Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9149373Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9149538Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9149840Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9149995Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9150327Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9150487Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9150795Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9150956Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9151255Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9151400Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.9151712Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9151870Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9152366Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 3896508416.
2025-12-04T15:04:54.9152489Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9152699Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9153067Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda
2025-12-04T15:04:54.9153187Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9153414Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9153588Z [rank1]:E1204 15:02:13.553000 454498 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T15:04:54.9153630Z dist init r=1, world=4
2025-12-04T15:04:54.9153775Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9153946Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9154251Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9154417Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9154756Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9154891Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9155190Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9155361Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9155657Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9155815Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9156110Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9156266Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9156565Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9156723Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9157218Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 3829399552.
2025-12-04T15:04:54.9157342Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9157550Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9157916Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda
2025-12-04T15:04:54.9158037Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9158265Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9158444Z [rank3]:E1204 15:02:13.566000 454500 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T15:04:54.9158486Z dist init r=3, world=4
2025-12-04T15:04:54.9158845Z [rank0]:[W1204 15:02:13.279768262 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T15:04:54.9158901Z FAILED [12.1251s] [ 20%]
2025-12-04T15:04:54.9158903Z
2025-12-04T15:04:54.9158968Z =================================== FAILURES ===================================
2025-12-04T15:04:54.9159084Z ________ TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda _________
2025-12-04T15:04:54.9159141Z Traceback (most recent call last):
2025-12-04T15:04:54.9159315Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.9159364Z self._join_processes(fn)
2025-12-04T15:04:54.9159550Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.9159611Z self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.9159813Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.9159864Z raise RuntimeError(error)
2025-12-04T15:04:54.9159949Z RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T15:04:54.9160001Z Traceback (most recent call last):
2025-12-04T15:04:54.9160211Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9160261Z getattr(self, test_name)()
2025-12-04T15:04:54.9160430Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9160470Z fn()
2025-12-04T15:04:54.9160646Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9160694Z method(*args, **kwargs)
2025-12-04T15:04:54.9160855Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9160902Z method(*args, **kwargs)
2025-12-04T15:04:54.9161061Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9161105Z with policy():
2025-12-04T15:04:54.9161267Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9161314Z raise RuntimeError(msg)
2025-12-04T15:04:54.9161676Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 0. CUDA driver allocated memory was 2453667840 and is now 4032823296.
2025-12-04T15:04:54.9161681Z
2025-12-04T15:04:54.9161762Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9161994Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda
2025-12-04T15:04:54.9161997Z
2025-12-04T15:04:54.9162091Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9162094Z
2025-12-04T15:04:54.9162096Z
2025-12-04T15:04:54.9162178Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T15:04:54.9162272Z Process 0 terminated with exit code 10, terminating remaining processes.
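The RuntimeError above comes from PyTorch's CUDA memory-leak checker, which snapshots allocator statistics around the test body and fails the test if usage grew. A minimal sketch of the idea, assuming a hypothetical LeakCheck helper (the real logic lives in torch/testing/_internal/common_utils.py and also consults the driver-level allocation counters quoted in the error text):

import torch

class LeakCheck:
    # Hypothetical sketch, not the real implementation: compare
    # caching-allocator usage before and after a block and raise if it grew.
    def __init__(self, device: int = 0) -> None:
        self.device = device

    def __enter__(self) -> "LeakCheck":
        torch.cuda.synchronize(self.device)
        self.before = torch.cuda.memory_allocated(self.device)
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        torch.cuda.synchronize(self.device)
        after = torch.cuda.memory_allocated(self.device)
        if exc_type is None and after > self.before:
            raise RuntimeError(
                f"possible leak on device {self.device}: caching allocator "
                f"allocated memory was {self.before} and is now {after}")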
2025-12-04T15:04:54.9162529Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6ba3c00338b4ee7a.xml -
2025-12-04T15:04:54.9162597Z =========================== short test summary info ============================
2025-12-04T15:04:54.9162855Z FAILED [12.1251s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T15:04:54.9162907Z Traceback (most recent call last):
2025-12-04T15:04:54.9163083Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9163161Z getattr(self, test_name)()
2025-12-04T15:04:54.9163332Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9163371Z fn()
2025-12-04T15:04:54.9163537Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9163588Z method(*args, **kwargs)
2025-12-04T15:04:54.9163750Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9163795Z method(*args, **kwargs)
2025-12-04T15:04:54.9163971Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9164013Z with policy():
2025-12-04T15:04:54.9164177Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9164223Z raise RuntimeError(msg)
2025-12-04T15:04:54.9164593Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 0. CUDA driver allocated memory was 2453667840 and is now 4032823296.
2025-12-04T15:04:54.9164595Z
2025-12-04T15:04:54.9164688Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9164920Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda
2025-12-04T15:04:54.9164923Z
2025-12-04T15:04:54.9165017Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9165084Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T15:04:54.9165157Z ====================== 1 failed, 22 deselected in 12.29s =======================
2025-12-04T15:04:54.9165199Z Got exit code 1
2025-12-04T15:04:54.9165246Z Retrying single test...
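The "Got exit code 1" / "Retrying single test..." lines reflect the harness rerunning only the failing test in a fresh pytest session. A rough sketch of that outer loop under the assumption of a simple subprocess-based runner (run_until_pass and max_retries are illustrative names, not the harness's real API):

import subprocess
import sys

def run_until_pass(test_id: str, max_retries: int = 2) -> bool:
    # Illustrative only: invoke pytest on a single test id, stop at the
    # first failure (-x), and retry in a fresh process a few times.
    for _ in range(1 + max_retries):
        rc = subprocess.call([sys.executable, "-m", "pytest", "-x", test_id])
        if rc == 0:
            return True
        print("Got exit code", rc)
        print("Retrying single test...")
    return False

run_until_pass("test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda")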
2025-12-04T15:04:54.9165446Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-77de46af66b54f9d.xml
2025-12-04T15:04:54.9165509Z ============================= test session starts ==============================
2025-12-04T15:04:54.9165631Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.9165675Z cachedir: .pytest_cache
2025-12-04T15:04:54.9165845Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.9165899Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.9165941Z configfile: pytest.ini
2025-12-04T15:04:54.9166121Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.9166201Z collecting ... collected 60 items / 26 deselected / 34 selected
2025-12-04T15:04:54.9166426Z stepcurrent: skipping 22 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda
2025-12-04T15:04:54.9166475Z Running 1 items in this shard
2025-12-04T15:04:54.9166477Z
2025-12-04T15:04:54.9166790Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda I1204 15:02:18.278000 454830 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 454899
2025-12-04T15:04:54.9166958Z I1204 15:02:18.279000 454830 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 454900
2025-12-04T15:04:54.9167120Z I1204 15:02:18.280000 454830 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 454901
2025-12-04T15:04:54.9167293Z I1204 15:02:18.280000 454830 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 454902
2025-12-04T15:04:54.9167689Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9167741Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9168132Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9168184Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9168555Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9168607Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9168997Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9169047Z self.encoder = TransformerEncoder(
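The repeated enable_nested_tensor UserWarning above fires because the encoder layer's self-attention was built with batch_first=False. A small example of the construction the warning is asking for (the dimensions here are arbitrary, not taken from the test):

import torch.nn as nn

# batch_first=True lets TransformerEncoder keep its nested-tensor fast path
# and silences the warning seen in this log; d_model/nhead are arbitrary.
layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2, enable_nested_tensor=True)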
2025-12-04T15:04:54.9169665Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.9169707Z _warn_cpu_init()
2025-12-04T15:04:54.9170345Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.9170388Z _warn_cpu_init()
2025-12-04T15:04:54.9170990Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.9171033Z _warn_cpu_init()
2025-12-04T15:04:54.9171341Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T15:04:54.9171391Z return func(*args, **kwargs)
2025-12-04T15:04:54.9171995Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.9172067Z _warn_cpu_init()
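The _warn_cpu_init warnings above are emitted because the module handed to FSDP still lives on CPU. The fix the warning text recommends is to pass device_id so FSDP moves the module to the target GPU before sharding initialization; a minimal sketch, assuming the process group is already initialized and the module here is a stand-in:

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

model = nn.Linear(8, 8)  # placeholder for the real CPU-constructed module
# Passing device_id lets FSDP run sharding initialization on the GPU and
# makes sync_module_states=True legal, per the warning above.
fsdp_model = FSDP(model, device_id=torch.cuda.current_device())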
2025-12-04T15:04:54.9172221Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9172395Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9172707Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9172887Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9173192Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9173327Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9173625Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9173796Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9174095Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9174254Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9174549Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9174697Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9174994Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9175155Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9175655Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 3829399552.
2025-12-04T15:04:54.9175782Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9175989Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9176357Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda
2025-12-04T15:04:54.9176479Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9176720Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9176907Z [rank3]:E1204 15:02:28.052000 454902 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T15:04:54.9177054Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9177226Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9177544Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9177710Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9178019Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9178152Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9178465Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9178623Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9178921Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9179082Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9179379Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9179526Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9179823Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9179983Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9180525Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 4032823296.
2025-12-04T15:04:54.9180652Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9180861Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9181226Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda
2025-12-04T15:04:54.9181378Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9181606Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9181784Z [rank0]:E1204 15:02:28.052000 454899 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T15:04:54.9181825Z dist init r=3, world=4
2025-12-04T15:04:54.9181869Z dist init r=0, world=4
2025-12-04T15:04:54.9182030Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9182204Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9182511Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9182677Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9182995Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9183128Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9183423Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9183582Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9183878Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9184041Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9184341Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9184486Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9184785Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9184942Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9185444Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3879731200.
2025-12-04T15:04:54.9185569Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9185781Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9186179Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda
2025-12-04T15:04:54.9186302Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9186536Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9186736Z [rank2]:E1204 15:02:28.068000 454901 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T15:04:54.9186780Z dist init r=2, world=4
2025-12-04T15:04:54.9186933Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9187108Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9187416Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9187594Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9187904Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9188038Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9188340Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9188499Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9188798Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9188960Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9189263Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9189412Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9189714Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9189876Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9190418Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 3896508416.
2025-12-04T15:04:54.9190574Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9190784Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9191156Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda
2025-12-04T15:04:54.9191292Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9191522Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9191705Z [rank1]:E1204 15:02:28.126000 454900 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T15:04:54.9191748Z dist init r=1, world=4
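Each rank logs its exception and exits with code 10; the parent then joins the workers and converts any nonzero exit code into the RuntimeError shown in the FAILURES section below. A stripped-down sketch of that join-and-check pattern (names are illustrative, not the real common_distributed.py API):

import multiprocessing as mp

def _worker(rank: int) -> None:
    raise SystemExit(10)  # stand-in for a per-rank test body that failed

def join_and_check(world_size: int = 4) -> None:
    # Spawn one process per rank, wait for all of them, then surface the
    # first nonzero exit code the way _check_return_codes does in the log.
    procs = [mp.Process(target=_worker, args=(r,)) for r in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")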
2025-12-04T15:04:54.9192112Z [rank0]:[W1204 15:02:28.762093678 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T15:04:54.9192169Z FAILED [11.7228s] [100%]
2025-12-04T15:04:54.9192172Z
2025-12-04T15:04:54.9192236Z =================================== FAILURES ===================================
2025-12-04T15:04:54.9192340Z ________ TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda _________
2025-12-04T15:04:54.9192391Z Traceback (most recent call last):
2025-12-04T15:04:54.9192567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.9192618Z self._join_processes(fn)
2025-12-04T15:04:54.9192805Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.9192864Z self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.9193056Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.9193106Z raise RuntimeError(error)
2025-12-04T15:04:54.9193192Z RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T15:04:54.9193242Z Traceback (most recent call last):
2025-12-04T15:04:54.9193417Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9193465Z getattr(self, test_name)()
2025-12-04T15:04:54.9193637Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9193680Z fn()
2025-12-04T15:04:54.9193843Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9193892Z method(*args, **kwargs)
2025-12-04T15:04:54.9194054Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9194101Z method(*args, **kwargs)
2025-12-04T15:04:54.9194263Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9194304Z with policy():
2025-12-04T15:04:54.9194469Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9194515Z raise RuntimeError(msg)
2025-12-04T15:04:54.9194895Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 3829399552.
2025-12-04T15:04:54.9194914Z
2025-12-04T15:04:54.9194996Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9195231Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda
2025-12-04T15:04:54.9195234Z
2025-12-04T15:04:54.9195329Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9195331Z
2025-12-04T15:04:54.9195333Z
2025-12-04T15:04:54.9195429Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T15:04:54.9195524Z Process 3 terminated with exit code 10, terminating remaining processes.
2025-12-04T15:04:54.9195777Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-77de46af66b54f9d.xml -
2025-12-04T15:04:54.9195843Z =========================== short test summary info ============================
2025-12-04T15:04:54.9196101Z FAILED [11.7228s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T15:04:54.9196151Z Traceback (most recent call last):
2025-12-04T15:04:54.9196342Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9196389Z getattr(self, test_name)()
2025-12-04T15:04:54.9196567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9196607Z fn()
2025-12-04T15:04:54.9196774Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9196821Z method(*args, **kwargs)
2025-12-04T15:04:54.9196985Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9197030Z method(*args, **kwargs)
2025-12-04T15:04:54.9197192Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9197236Z with policy():
2025-12-04T15:04:54.9197399Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9197448Z raise RuntimeError(msg)
2025-12-04T15:04:54.9197817Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 3829399552.
2025-12-04T15:04:54.9197821Z
2025-12-04T15:04:54.9197904Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9198135Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda
2025-12-04T15:04:54.9198138Z
2025-12-04T15:04:54.9198239Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9198308Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T15:04:54.9198380Z ====================== 1 failed, 26 deselected in 11.88s =======================
2025-12-04T15:04:54.9198420Z Got exit code 1
2025-12-04T15:04:54.9198467Z Retrying single test...
2025-12-04T15:04:54.9198669Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d112d86db9fcabed.xml
2025-12-04T15:04:54.9198749Z ============================= test session starts ==============================
2025-12-04T15:04:54.9198886Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.9198935Z cachedir: .pytest_cache
2025-12-04T15:04:54.9199105Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.9199158Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.9199200Z configfile: pytest.ini
2025-12-04T15:04:54.9199379Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.9199469Z collecting ... collected 60 items / 26 deselected / 34 selected
2025-12-04T15:04:54.9199698Z stepcurrent: skipping 22 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda
2025-12-04T15:04:54.9199747Z Running 1 items in this shard
2025-12-04T15:04:54.9199752Z
2025-12-04T15:04:54.9200069Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda I1204 15:02:32.599000 455232 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 455301
2025-12-04T15:04:54.9200295Z I1204 15:02:32.599000 455232 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 455302
2025-12-04T15:04:54.9200479Z I1204 15:02:32.600000 455232 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 455303
2025-12-04T15:04:54.9200645Z I1204 15:02:32.600000 455232 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 455304
2025-12-04T15:04:54.9201033Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9201090Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9201472Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9201526Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9201909Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9201960Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9202345Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9202393Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9203021Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.9203061Z _warn_cpu_init()
2025-12-04T15:04:54.9203673Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.9203751Z _warn_cpu_init()
2025-12-04T15:04:54.9204068Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T15:04:54.9204118Z return func(*args, **kwargs)
2025-12-04T15:04:54.9204750Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.9204793Z _warn_cpu_init()
2025-12-04T15:04:54.9205417Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T15:04:54.9205460Z _warn_cpu_init()
2025-12-04T15:04:54.9205617Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9205794Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9206110Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9206280Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9206592Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9206729Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9207031Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9207190Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9207491Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9207651Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9207948Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9208095Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9208404Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9208574Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9209085Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 4032823296.
2025-12-04T15:04:54.9209211Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9209420Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9209940Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda
2025-12-04T15:04:54.9210079Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9210343Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9210522Z [rank0]:E1204 15:02:42.375000 455301 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T15:04:54.9210563Z dist init r=0, world=4
2025-12-04T15:04:54.9210715Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9210886Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9211195Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9211361Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9211670Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9211802Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9212099Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9212255Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9212552Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9212712Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9213007Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9213183Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9213480Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9213641Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9214158Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3896508416.
2025-12-04T15:04:54.9214282Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9214492Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9214872Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda 2025-12-04T15:04:54.9214996Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9215223Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9215403Z [rank1]:E1204 15:02:42.452000 455302 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.9215448Z dist init r=1, world=4 2025-12-04T15:04:54.9215595Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9215767Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9216073Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9216237Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9216540Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9216677Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9216974Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9217135Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9217429Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9217611Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9217907Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9218052Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.9218359Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9218516Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9219014Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 3829399552. 2025-12-04T15:04:54.9219137Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9219358Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9219725Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda 2025-12-04T15:04:54.9219846Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9220074Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9220294Z [rank3]:E1204 15:02:42.461000 455304 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.9220347Z dist init r=3, world=4 2025-12-04T15:04:54.9220495Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9220670Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9220977Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9221146Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9221454Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9221589Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9221889Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9222047Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T15:04:54.9222381Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9222540Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9222840Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9223002Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9223300Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9223463Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9223973Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 3879731200. 2025-12-04T15:04:54.9224099Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9224307Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9224675Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda 2025-12-04T15:04:54.9224796Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9225027Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9225206Z [rank2]:E1204 15:02:42.462000 455303 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9225248Z dist init r=2, world=4 2025-12-04T15:04:54.9225610Z [rank0]:[W1204 15:02:42.073125272 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.9225656Z FAILED [11.6238s] [100%] 2025-12-04T15:04:54.9225658Z 2025-12-04T15:04:54.9225723Z =================================== FAILURES =================================== 2025-12-04T15:04:54.9225827Z ________ TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda _________ 2025-12-04T15:04:54.9225879Z Traceback (most recent call last): 2025-12-04T15:04:54.9226054Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.9226105Z self._join_processes(fn) 2025-12-04T15:04:54.9226291Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.9226353Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.9226555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.9226617Z raise RuntimeError(error) 2025-12-04T15:04:54.9226704Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9226756Z Traceback (most recent call last): 2025-12-04T15:04:54.9226929Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9226980Z getattr(self, test_name)() 2025-12-04T15:04:54.9227149Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9227190Z fn() 2025-12-04T15:04:54.9227363Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9227412Z method(*args, **kwargs) 2025-12-04T15:04:54.9227574Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9227623Z method(*args, **kwargs) 2025-12-04T15:04:54.9227782Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9227825Z with policy(): 2025-12-04T15:04:54.9227986Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9228045Z raise RuntimeError(msg) 2025-12-04T15:04:54.9228409Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 4032823296. 2025-12-04T15:04:54.9228416Z 2025-12-04T15:04:54.9228497Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9228730Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda 2025-12-04T15:04:54.9228733Z 2025-12-04T15:04:54.9228828Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9228830Z 2025-12-04T15:04:54.9228832Z 2025-12-04T15:04:54.9228916Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.9229011Z Process 0 terminated with exit code 10, terminating remaining processes. 
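The ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit, which can leak resources") points at a real hygiene issue in spawned distributed workers: each process should tear down its process group before exiting. Below is a minimal sketch of the recommended lifecycle, assuming a single node with 4 GPUs; the MASTER_ADDR/MASTER_PORT values and the all_reduce body are illustrative stand-ins, not the harness's actual code.

```python
import os

import torch
import torch.distributed as dist
import torch.multiprocessing as mp


def worker(rank: int, world_size: int) -> None:
    # Illustrative rendezvous settings; a real harness picks a free port.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    torch.cuda.set_device(rank)
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        t = torch.ones(1, device="cuda")
        dist.all_reduce(t)  # stand-in for the real test body
    finally:
        # The warning in the log fires when this call is skipped.
        dist.destroy_process_group()


if __name__ == "__main__":
    mp.spawn(worker, args=(4,), nprocs=4)
```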
2025-12-04T15:04:54.9229264Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d112d86db9fcabed.xml - 2025-12-04T15:04:54.9229331Z =========================== short test summary info ============================ 2025-12-04T15:04:54.9229586Z FAILED [11.6238s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9229637Z Traceback (most recent call last): 2025-12-04T15:04:54.9229817Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9229864Z getattr(self, test_name)() 2025-12-04T15:04:54.9230039Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9230079Z fn() 2025-12-04T15:04:54.9230273Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9230318Z method(*args, **kwargs) 2025-12-04T15:04:54.9230482Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9230530Z method(*args, **kwargs) 2025-12-04T15:04:54.9230707Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9230766Z with policy(): 2025-12-04T15:04:54.9230930Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9230979Z raise RuntimeError(msg) 2025-12-04T15:04:54.9231354Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 4032823296. 2025-12-04T15:04:54.9231357Z 2025-12-04T15:04:54.9231441Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9231683Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda 2025-12-04T15:04:54.9231687Z 2025-12-04T15:04:54.9231784Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9231854Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
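Every failure in this log comes from the test harness's CUDA memory-leak check (enabled by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1), which snapshots memory usage around each test and raises on growth; here the caching allocator went from 512 to 80384 bytes and driver-allocated memory grew by roughly 1.5 GiB per device. A simplified, hypothetical re-implementation of such a check, using only public torch.cuda APIs (cuda_leak_check is an invented name, not the harness's class, and the real check is more forgiving about allocator noise):

```python
import gc
from contextlib import contextmanager

import torch


@contextmanager
def cuda_leak_check(device: int = 0):
    """Hypothetical sketch of a CUDA leak check around a test body.

    Compares caching-allocator usage and driver-level usage (via
    torch.cuda.mem_get_info) before and after the wrapped block.
    """
    torch.cuda.synchronize(device)
    gc.collect()
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)
    free_before, total = torch.cuda.mem_get_info(device)
    yield
    torch.cuda.synchronize(device)
    gc.collect()
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    if alloc_after > alloc_before or free_after < free_before:
        raise RuntimeError(
            f"possible CUDA leak on device {device}: allocator "
            f"{alloc_before} -> {alloc_after} bytes, driver-used "
            f"{total - free_before} -> {total - free_after} bytes"
        )
```

The repro command printed by the harness applies as-is: PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_none_cuda, and setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 suppresses the repro hint.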
2025-12-04T15:04:54.9231925Z ====================== 1 failed, 26 deselected in 11.78s ======================= 2025-12-04T15:04:54.9231966Z Got exit code 1 2025-12-04T15:04:54.9232144Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda 2025-12-04T15:04:54.9232296Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.9232504Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d786f26c592427e2.xml 2025-12-04T15:04:54.9232566Z ============================= test session starts ============================== 2025-12-04T15:04:54.9232691Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.9232738Z cachedir: .pytest_cache 2025-12-04T15:04:54.9232912Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.9232963Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.9233013Z configfile: pytest.ini 2025-12-04T15:04:54.9233186Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.9233271Z collecting ... collected 60 items / 23 deselected / 37 selected 2025-12-04T15:04:54.9233332Z stepcurrent: skipping 23 already run items. 2025-12-04T15:04:54.9233380Z Running 4 items in this shard 2025-12-04T15:04:54.9233383Z 2025-12-04T15:04:54.9233708Z distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_False_cuda I1204 15:02:46.944000 455634 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 455703 2025-12-04T15:04:54.9233876Z I1204 15:02:46.945000 455634 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 455704 2025-12-04T15:04:54.9234041Z I1204 15:02:46.946000 455634 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 455705 2025-12-04T15:04:54.9234201Z I1204 15:02:46.946000 455634 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 455706 2025-12-04T15:04:54.9234587Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9234639Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9235168Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.9235264Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9235643Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9235696Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9236231Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9236302Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9236678Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9236755Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9237275Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9237342Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9237722Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9237772Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9238291Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
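The FSDP UserWarning repeated on each rank above already names the fix: make the device index explicit instead of passing a bare `device_id` of "cuda". A minimal sketch of both options, assuming the default process group is already initialized and `rank` is the local rank on a single node (`model` is any nn.Module):

```python
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def wrap_with_fsdp(model: nn.Module, rank: int) -> FSDP:
    # Assumes dist.init_process_group(...) has already run in this process.
    # Option 1: set the current device first, so the device FSDP resolves
    # from an unindexed device_id is no longer ambiguous.
    torch.cuda.set_device(rank)
    return FSDP(model, device_id=torch.cuda.current_device())
    # Option 2 (equivalent): pass an explicitly indexed device instead:
    #   return FSDP(model, device_id=torch.device("cuda", rank))
```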
2025-12-04T15:04:54.9238354Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9238510Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9238686Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9239004Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9239177Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9239485Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9239624Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9239944Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9240107Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9240565Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9240745Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9241042Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9241194Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9241496Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9241677Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9242188Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3401580544. 
2025-12-04T15:04:54.9242313Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9242525Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9242898Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9243026Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9243254Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9243430Z [rank2]:E1204 15:02:54.266000 455705 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9243478Z dist init r=2, world=4 2025-12-04T15:04:54.9243626Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9243800Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9244112Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9244283Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9244602Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9244755Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9245049Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9245209Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9245517Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9245676Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9245975Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9246134Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.9246434Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9246593Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9247095Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3554672640. 2025-12-04T15:04:54.9247222Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9247430Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9247806Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9247927Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9248157Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9248331Z [rank0]:E1204 15:02:54.271000 455703 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.9248377Z dist init r=0, world=4 2025-12-04T15:04:54.9248527Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9248700Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9249008Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9249198Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9249506Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9249638Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9249947Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9250105Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T15:04:54.9250452Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9250610Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9250922Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9251071Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9251367Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9251530Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9252030Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3418357760. 2025-12-04T15:04:54.9252160Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9252369Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9252743Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9252869Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9253095Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9253274Z [rank1]:E1204 15:02:54.283000 455704 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.9253317Z dist init r=1, world=4 2025-12-04T15:04:54.9253467Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9253652Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9253975Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9254141Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 
2025-12-04T15:04:54.9254467Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9254600Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9254896Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9255056Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9255362Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9255522Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9255820Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9255968Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9256263Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9256423Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9256921Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 3351248896. 
2025-12-04T15:04:54.9257042Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9257252Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9257622Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9257745Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9257967Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9258144Z [rank3]:E1204 15:02:54.310000 455706 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.9258209Z dist init r=3, world=4 2025-12-04T15:04:54.9258255Z FAILED [8.7179s] [ 25%] 2025-12-04T15:04:54.9258257Z 2025-12-04T15:04:54.9258318Z =================================== FAILURES =================================== 2025-12-04T15:04:54.9258422Z ______ TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda ______ 2025-12-04T15:04:54.9258472Z Traceback (most recent call last): 2025-12-04T15:04:54.9258646Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.9258692Z self._join_processes(fn) 2025-12-04T15:04:54.9258892Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.9258952Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.9259141Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.9259191Z raise RuntimeError(error) 2025-12-04T15:04:54.9259277Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.9259328Z Traceback (most recent call last): 2025-12-04T15:04:54.9259500Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9259547Z getattr(self, test_name)() 2025-12-04T15:04:54.9259727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9259767Z fn() 2025-12-04T15:04:54.9259928Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9259974Z method(*args, **kwargs) 2025-12-04T15:04:54.9260133Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9260212Z method(*args, **kwargs) 2025-12-04T15:04:54.9260373Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9260418Z with policy(): 2025-12-04T15:04:54.9260580Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T15:04:54.9260625Z raise RuntimeError(msg) 2025-12-04T15:04:54.9260993Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3401580544. 2025-12-04T15:04:54.9260996Z 2025-12-04T15:04:54.9261080Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9261317Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9261320Z 2025-12-04T15:04:54.9261416Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9261418Z 2025-12-04T15:04:54.9261420Z 2025-12-04T15:04:54.9261504Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.9261598Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.9261849Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d786f26c592427e2.xml - 2025-12-04T15:04:54.9261914Z =========================== short test summary info ============================ 2025-12-04T15:04:54.9262169Z FAILED [8.7179s] distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.9262250Z Traceback (most recent call last): 2025-12-04T15:04:54.9262426Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9262472Z getattr(self, test_name)() 2025-12-04T15:04:54.9262645Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9262682Z fn() 2025-12-04T15:04:54.9262845Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9262891Z method(*args, **kwargs) 2025-12-04T15:04:54.9263069Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9263112Z method(*args, **kwargs) 2025-12-04T15:04:54.9263274Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9263315Z with policy(): 2025-12-04T15:04:54.9263479Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9263523Z raise RuntimeError(msg) 2025-12-04T15:04:54.9263900Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3401580544. 
2025-12-04T15:04:54.9263902Z 2025-12-04T15:04:54.9263983Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9264218Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9264221Z 2025-12-04T15:04:54.9264316Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9264385Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.9264455Z ======================= 1 failed, 23 deselected in 8.88s ======================= 2025-12-04T15:04:54.9264495Z Got exit code 1 2025-12-04T15:04:54.9264544Z Retrying single test... 2025-12-04T15:04:54.9264743Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c29c1977505dbf7c.xml 2025-12-04T15:04:54.9264807Z ============================= test session starts ============================== 2025-12-04T15:04:54.9264927Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.9264973Z cachedir: .pytest_cache 2025-12-04T15:04:54.9265141Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.9265195Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.9265238Z configfile: pytest.ini 2025-12-04T15:04:54.9265415Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.9265495Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.9265728Z stepcurrent: skipping 23 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9265776Z Running 1 items in this shard 2025-12-04T15:04:54.9265778Z 2025-12-04T15:04:54.9266095Z distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_False_cuda I1204 15:02:58.118000 456028 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 456097 2025-12-04T15:04:54.9266260Z I1204 15:02:58.118000 456028 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 456098 2025-12-04T15:04:54.9266449Z I1204 15:02:58.119000 456028 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 456099 2025-12-04T15:04:54.9266611Z I1204 15:02:58.120000 456028 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 456100 2025-12-04T15:04:54.9266995Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9267049Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9267583Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9267655Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9268046Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9268100Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9268474Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9268523Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9269046Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9269112Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9269630Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9269693Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9270067Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9270116Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9270673Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.9270739Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9270891Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9271082Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9271412Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9271578Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9271895Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9272032Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9272327Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9272490Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9272801Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9272959Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9273263Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9273410Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9273718Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9273876Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9274379Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 3351248896. 
2025-12-04T15:04:54.9274507Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9274714Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9275088Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9275209Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9275439Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9275614Z [rank3]:E1204 15:03:05.345000 456100 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.9275682Z dist init r=3, world=4 2025-12-04T15:04:54.9275828Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9276002Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9276310Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9276488Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9276796Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9276930Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9277228Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9277396Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9277692Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9277849Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9278145Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9278292Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.9278590Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9278752Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9279248Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3401580544. 2025-12-04T15:04:54.9279373Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9279581Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9279952Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9280074Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9280346Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9280537Z [rank2]:E1204 15:03:05.349000 456099 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9280578Z dist init r=2, world=4 2025-12-04T15:04:54.9280731Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9280901Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9281223Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9281388Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9281693Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9281824Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9282139Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9282297Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T15:04:54.9282592Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9286547Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9286860Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9287008Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9287317Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9287478Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9287983Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3554672640. 2025-12-04T15:04:54.9288106Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9288319Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9288694Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9288865Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9289092Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9289267Z [rank0]:E1204 15:03:05.396000 456097 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.9289312Z dist init r=0, world=4 2025-12-04T15:04:54.9289476Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9289650Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9289961Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9290128Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 
2025-12-04T15:04:54.9290497Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9290634Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9290933Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9291093Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9291391Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9291549Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9291848Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9291994Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9292296Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9292455Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9292955Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3418357760. 
2025-12-04T15:04:54.9293080Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9293318Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9293708Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9293830Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9294078Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9294255Z [rank1]:E1204 15:03:05.401000 456098 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.9294297Z dist init r=1, world=4 2025-12-04T15:04:54.9294343Z FAILED [8.5184s] [100%] 2025-12-04T15:04:54.9294347Z 2025-12-04T15:04:54.9294410Z =================================== FAILURES =================================== 2025-12-04T15:04:54.9294515Z ______ TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda ______ 2025-12-04T15:04:54.9294566Z Traceback (most recent call last): 2025-12-04T15:04:54.9294742Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.9294801Z self._join_processes(fn) 2025-12-04T15:04:54.9294990Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.9295050Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.9295242Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.9295291Z raise RuntimeError(error) 2025-12-04T15:04:54.9295379Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.9295427Z Traceback (most recent call last): 2025-12-04T15:04:54.9295603Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9295649Z getattr(self, test_name)() 2025-12-04T15:04:54.9295821Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9295859Z fn() 2025-12-04T15:04:54.9296023Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9296070Z method(*args, **kwargs) 2025-12-04T15:04:54.9296232Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9296278Z method(*args, **kwargs) 2025-12-04T15:04:54.9296439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9296479Z with policy(): 2025-12-04T15:04:54.9296642Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T15:04:54.9296686Z raise RuntimeError(msg) 2025-12-04T15:04:54.9297054Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 3351248896. 2025-12-04T15:04:54.9297056Z 2025-12-04T15:04:54.9297139Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9297380Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9297412Z 2025-12-04T15:04:54.9297510Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9297516Z 2025-12-04T15:04:54.9297518Z 2025-12-04T15:04:54.9297601Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.9297698Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.9297951Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c29c1977505dbf7c.xml - 2025-12-04T15:04:54.9298018Z =========================== short test summary info ============================ 2025-12-04T15:04:54.9298282Z FAILED [8.5184s] distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.9298336Z Traceback (most recent call last): 2025-12-04T15:04:54.9298511Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9298559Z getattr(self, test_name)() 2025-12-04T15:04:54.9298728Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9298768Z fn() 2025-12-04T15:04:54.9298939Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9298986Z method(*args, **kwargs) 2025-12-04T15:04:54.9299148Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9299194Z method(*args, **kwargs) 2025-12-04T15:04:54.9299356Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9299400Z with policy(): 2025-12-04T15:04:54.9299562Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9299607Z raise RuntimeError(msg) 2025-12-04T15:04:54.9299974Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 3351248896. 
2025-12-04T15:04:54.9299977Z 2025-12-04T15:04:54.9300056Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9300340Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9300342Z 2025-12-04T15:04:54.9300434Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9300504Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.9300571Z ======================= 1 failed, 26 deselected in 8.68s ======================= 2025-12-04T15:04:54.9300614Z Got exit code 1 2025-12-04T15:04:54.9300656Z Retrying single test... 2025-12-04T15:04:54.9300859Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0e239b8de501c89d.xml 2025-12-04T15:04:54.9300922Z ============================= test session starts ============================== 2025-12-04T15:04:54.9301043Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.9301087Z cachedir: .pytest_cache 2025-12-04T15:04:54.9301260Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.9301310Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.9301372Z configfile: pytest.ini 2025-12-04T15:04:54.9301566Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.9301650Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.9301878Z stepcurrent: skipping 23 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9301929Z Running 1 items in this shard 2025-12-04T15:04:54.9301931Z 2025-12-04T15:04:54.9302265Z distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_False_cuda I1204 15:03:09.081000 456422 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 456491 2025-12-04T15:04:54.9302432Z I1204 15:03:09.082000 456422 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 456492 2025-12-04T15:04:54.9302596Z I1204 15:03:09.082000 456422 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 456493 2025-12-04T15:04:54.9302758Z I1204 15:03:09.083000 456422 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 456494 2025-12-04T15:04:54.9303156Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9303210Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9303739Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9303808Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9304187Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9304239Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9304765Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9304832Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9305209Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9305262Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9305781Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9305845Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9306222Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9306294Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9306811Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.9306873Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9307040Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9307215Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9307529Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9307694Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9308019Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9308157Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9308453Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9308615Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9308911Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9309071Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9309366Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9309515Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9309817Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9309979Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9310520Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3554672640. 
2025-12-04T15:04:54.9310644Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9310869Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9311259Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9311384Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9311626Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9311803Z [rank0]:E1204 15:03:16.319000 456491 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.9311848Z dist init r=0, world=4 2025-12-04T15:04:54.9311998Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9312169Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9312491Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9312660Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9312965Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9313102Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9313396Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9313557Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9313854Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9314015Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9314316Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9314464Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.9314769Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9314928Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9315438Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 3351248896. 2025-12-04T15:04:54.9315585Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9315795Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9316187Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9316308Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9316534Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9316712Z [rank3]:E1204 15:03:16.328000 456494 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.9316757Z dist init r=3, world=4 2025-12-04T15:04:54.9316904Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9317088Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9317394Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9317560Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9317867Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9317999Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9318294Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9318452Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T15:04:54.9318751Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9318907Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9319201Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9319346Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9319645Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9319805Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9320344Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3401580544. 2025-12-04T15:04:54.9320469Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9320676Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9321064Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9321187Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9321413Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9321603Z [rank2]:E1204 15:03:16.374000 456493 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9321647Z dist init r=2, world=4 2025-12-04T15:04:54.9321798Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9321969Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9322276Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9322440Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 
2025-12-04T15:04:54.9322745Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9322875Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9323171Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9323329Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9323629Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9323788Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9324078Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9324224Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9324532Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9324708Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9325204Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3418357760. 
2025-12-04T15:04:54.9325347Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9325556Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9325927Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9326047Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9326285Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9326462Z [rank1]:E1204 15:03:16.380000 456492 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.9326503Z dist init r=1, world=4 2025-12-04T15:04:54.9326547Z FAILED [8.4188s] [100%] 2025-12-04T15:04:54.9326550Z 2025-12-04T15:04:54.9326611Z =================================== FAILURES =================================== 2025-12-04T15:04:54.9326715Z ______ TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda ______ 2025-12-04T15:04:54.9326764Z Traceback (most recent call last): 2025-12-04T15:04:54.9326937Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.9326984Z self._join_processes(fn) 2025-12-04T15:04:54.9327171Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.9327229Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.9327422Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.9327469Z raise RuntimeError(error) 2025-12-04T15:04:54.9327557Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9327606Z Traceback (most recent call last): 2025-12-04T15:04:54.9327780Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9327826Z getattr(self, test_name)() 2025-12-04T15:04:54.9327997Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9328036Z fn() 2025-12-04T15:04:54.9328197Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9328244Z method(*args, **kwargs) 2025-12-04T15:04:54.9328404Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9328449Z method(*args, **kwargs) 2025-12-04T15:04:54.9328609Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9328674Z with policy(): 2025-12-04T15:04:54.9328836Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T15:04:54.9328882Z raise RuntimeError(msg) 2025-12-04T15:04:54.9329245Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3554672640. 2025-12-04T15:04:54.9329247Z 2025-12-04T15:04:54.9329329Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9329575Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9329579Z 2025-12-04T15:04:54.9329675Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9329678Z 2025-12-04T15:04:54.9329680Z 2025-12-04T15:04:54.9329762Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.9329858Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T15:04:54.9330116Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0e239b8de501c89d.xml - 2025-12-04T15:04:54.9330217Z =========================== short test summary info ============================ 2025-12-04T15:04:54.9330473Z FAILED [8.4188s] distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9330521Z Traceback (most recent call last): 2025-12-04T15:04:54.9330699Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9330747Z getattr(self, test_name)() 2025-12-04T15:04:54.9330918Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9330956Z fn() 2025-12-04T15:04:54.9331120Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9331164Z method(*args, **kwargs) 2025-12-04T15:04:54.9331327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9331369Z method(*args, **kwargs) 2025-12-04T15:04:54.9331529Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9331568Z with policy(): 2025-12-04T15:04:54.9331730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9331774Z raise RuntimeError(msg) 2025-12-04T15:04:54.9332135Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3554672640. 
2025-12-04T15:04:54.9332137Z 2025-12-04T15:04:54.9332216Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9332449Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9332452Z 2025-12-04T15:04:54.9332544Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9332612Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.9332701Z ======================= 1 failed, 26 deselected in 8.58s ======================= 2025-12-04T15:04:54.9332758Z Got exit code 1 2025-12-04T15:04:54.9332941Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_False_cuda 2025-12-04T15:04:54.9333076Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.9333281Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c1c32af9e1d32582.xml 2025-12-04T15:04:54.9333341Z ============================= test session starts ============================== 2025-12-04T15:04:54.9333477Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.9333522Z cachedir: .pytest_cache 2025-12-04T15:04:54.9333692Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.9333742Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.9333787Z configfile: pytest.ini 2025-12-04T15:04:54.9333958Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.9334040Z collecting ... collected 60 items / 24 deselected / 36 selected 2025-12-04T15:04:54.9334095Z stepcurrent: skipping 24 already run items. 2025-12-04T15:04:54.9334160Z Running 3 items in this shard 2025-12-04T15:04:54.9334163Z 2025-12-04T15:04:54.9334491Z distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda I1204 15:03:20.065000 456816 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 456885 2025-12-04T15:04:54.9334657Z I1204 15:03:20.066000 456816 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 456886 2025-12-04T15:04:54.9334828Z I1204 15:03:20.066000 456816 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 456887 2025-12-04T15:04:54.9335004Z I1204 15:03:20.067000 456816 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 456888 2025-12-04T15:04:54.9335389Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9335446Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9335975Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9336042Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9336419Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9336469Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9336845Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9336894Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9337416Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9337505Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9338032Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9338097Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9338475Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T15:04:54.9338528Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9339054Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.9339119Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9339273Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9339445Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9339759Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9339924Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9340275Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9340412Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9340712Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9340873Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9341172Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9341332Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9341629Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9341774Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9342109Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9342267Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9342796Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 
2025-12-04T15:04:54.9342923Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9343137Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9343527Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T15:04:54.9343668Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9343894Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9344070Z [rank3]:E1204 15:03:26.189000 456888 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.9344114Z dist init r=3, world=4 2025-12-04T15:04:54.9344266Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9344433Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9344741Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9344905Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9345210Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9345345Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9345642Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9345804Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9346104Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9346261Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9346566Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9346726Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.9347021Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9347191Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9347697Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T15:04:54.9347823Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9348029Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9348429Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T15:04:54.9348553Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9348778Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9348954Z [rank2]:E1204 15:03:26.191000 456887 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9348995Z dist init r=2, world=4 2025-12-04T15:04:54.9349145Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9349316Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9349626Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9349789Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9350094Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9350271Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9350568Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9350726Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9351021Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9351206Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9351499Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9351644Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9351953Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9352112Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9352618Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 2025-12-04T15:04:54.9352753Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9352965Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9353344Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T15:04:54.9353469Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9353695Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9353872Z [rank0]:E1204 15:03:26.197000 456885 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.9353912Z dist init r=0, world=4 2025-12-04T15:04:54.9354060Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9354231Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9354536Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9354700Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9355002Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9355137Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9355428Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9355607Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9355904Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9356062Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9356372Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9356516Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9356811Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9356967Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9357486Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328.
2025-12-04T15:04:54.9357606Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9357813Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9358195Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9358316Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9358543Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9358717Z [rank1]:E1204 15:03:26.244000 456886 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T15:04:54.9358762Z dist init r=1, world=4
2025-12-04T15:04:54.9358803Z FAILED [7.2165s] [ 33%]
2025-12-04T15:04:54.9358806Z
2025-12-04T15:04:54.9358866Z =================================== FAILURES ===================================
2025-12-04T15:04:54.9358970Z __ TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda ___
2025-12-04T15:04:54.9359020Z Traceback (most recent call last):
2025-12-04T15:04:54.9359190Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.9359239Z self._join_processes(fn)
2025-12-04T15:04:54.9359423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.9359480Z self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.9359668Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.9359735Z raise RuntimeError(error)
2025-12-04T15:04:54.9359833Z RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T15:04:54.9359884Z Traceback (most recent call last):
2025-12-04T15:04:54.9360053Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9360100Z getattr(self, test_name)()
2025-12-04T15:04:54.9360312Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9360352Z fn()
2025-12-04T15:04:54.9360511Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9360572Z method(*args, **kwargs)
2025-12-04T15:04:54.9360732Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9360780Z method(*args, **kwargs)
2025-12-04T15:04:54.9360939Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9360981Z with policy():
2025-12-04T15:04:54.9361143Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9361188Z raise RuntimeError(msg)
2025-12-04T15:04:54.9361579Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208.
2025-12-04T15:04:54.9361581Z
2025-12-04T15:04:54.9361664Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9361913Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9361919Z
2025-12-04T15:04:54.9362012Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9362014Z
2025-12-04T15:04:54.9362079Z Process 2 exited with error code 10 and exception:
2025-12-04T15:04:54.9362127Z Traceback (most recent call last):
2025-12-04T15:04:54.9362300Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9362346Z getattr(self, test_name)()
2025-12-04T15:04:54.9362517Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9362554Z fn()
2025-12-04T15:04:54.9362717Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9362759Z method(*args, **kwargs)
2025-12-04T15:04:54.9362920Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9362964Z method(*args, **kwargs)
2025-12-04T15:04:54.9363125Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9363163Z with policy():
2025-12-04T15:04:54.9363328Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9363372Z raise RuntimeError(msg)
2025-12-04T15:04:54.9363747Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112.
2025-12-04T15:04:54.9363750Z
2025-12-04T15:04:54.9363832Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9364090Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9364107Z
2025-12-04T15:04:54.9364202Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9364204Z
2025-12-04T15:04:54.9364206Z
2025-12-04T15:04:54.9364286Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T15:04:54.9364381Z Process 0 terminated with exit code 10, terminating remaining processes.
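The outer traceback in the FAILURES block shows the shape of the harness: each rank runs the test body in its own child process, and the parent's _join_processes/_check_return_codes fold any non-zero per-rank exit code into a single RuntimeError. A minimal sketch of that pattern, with hypothetical names (run_test_body, TEST_FAILURE_EXIT_CODE) standing in for the real common_distributed.py internals:

    import multiprocessing as mp

    TEST_FAILURE_EXIT_CODE = 10  # hypothetical constant mirroring "exit code: 10" above

    def run_test_body(rank: int, world_size: int) -> None:
        ...  # placeholder for the actual per-rank test logic

    def _rank_main(rank: int, world_size: int) -> None:
        # An uncaught failure in the test body becomes a non-zero exit code.
        try:
            run_test_body(rank, world_size)
        except Exception:
            raise SystemExit(TEST_FAILURE_EXIT_CODE)

    def run_multiprocess_test(world_size: int = 4) -> None:
        procs = [mp.Process(target=_rank_main, args=(r, world_size)) for r in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        failed = [(p.pid, p.exitcode) for p in procs if p.exitcode != 0]
        if failed:
            # This is why pytest reports "Process N exited with error code 10".
            raise RuntimeError(f"ranks failed with (pid, exitcode): {failed}")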
2025-12-04T15:04:54.9364639Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c1c32af9e1d32582.xml -
2025-12-04T15:04:54.9364707Z =========================== short test summary info ============================
2025-12-04T15:04:54.9364969Z FAILED [7.2165s] distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T15:04:54.9365022Z Traceback (most recent call last):
2025-12-04T15:04:54.9365201Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9365244Z getattr(self, test_name)()
2025-12-04T15:04:54.9365425Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9365461Z fn()
2025-12-04T15:04:54.9365620Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9365663Z method(*args, **kwargs)
2025-12-04T15:04:54.9365823Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9365865Z method(*args, **kwargs)
2025-12-04T15:04:54.9366027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9366067Z with policy():
2025-12-04T15:04:54.9366229Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9366271Z raise RuntimeError(msg)
2025-12-04T15:04:54.9366645Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208.
2025-12-04T15:04:54.9366647Z
2025-12-04T15:04:54.9366727Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9366971Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9366975Z
2025-12-04T15:04:54.9367067Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9367070Z
2025-12-04T15:04:54.9367135Z Process 2 exited with error code 10 and exception:
2025-12-04T15:04:54.9367183Z Traceback (most recent call last):
2025-12-04T15:04:54.9367356Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9367402Z getattr(self, test_name)()
2025-12-04T15:04:54.9367570Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9367607Z fn()
2025-12-04T15:04:54.9367770Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9367814Z method(*args, **kwargs)
2025-12-04T15:04:54.9367985Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9368041Z method(*args, **kwargs)
2025-12-04T15:04:54.9368200Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9368240Z with policy():
2025-12-04T15:04:54.9368399Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9368446Z raise RuntimeError(msg)
2025-12-04T15:04:54.9368827Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112.
2025-12-04T15:04:54.9368829Z
2025-12-04T15:04:54.9368911Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9369158Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9369161Z
2025-12-04T15:04:54.9369255Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9369322Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T15:04:54.9369402Z ======================= 1 failed, 24 deselected in 7.37s =======================
2025-12-04T15:04:54.9369442Z Got exit code 1
2025-12-04T15:04:54.9369487Z Retrying single test...
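The leak check that produced exit code 10 is enabled by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 in the repro command above: a context manager snapshots per-device memory counters before the test and compares them on __exit__. A simplified sketch of that style of check using only public torch.cuda APIs (not the actual check in common_utils.py line 2705, which is stricter and consults the CUDA driver API directly):

    import contextlib
    import torch

    @contextlib.contextmanager
    def memory_leak_check(device: int):
        # Snapshot caching-allocator and driver-level numbers before the test.
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)
        free_before, _total = torch.cuda.mem_get_info(device)
        try:
            yield
        finally:
            torch.cuda.synchronize(device)
            torch.cuda.empty_cache()
            alloc_after = torch.cuda.memory_allocated(device)
            free_after, _total = torch.cuda.mem_get_info(device)
            # Flag a leak only when both the allocator and the driver agree
            # that memory grew, matching the wording of the error above.
            if alloc_after > alloc_before and free_after < free_before:
                raise RuntimeError(
                    f"possible leak on device {device}: caching allocator "
                    f"{alloc_before} -> {alloc_after} bytes"
                )

That comparison is exactly what fired here: the caching allocator went from 512 to 27136 bytes, and driver-allocated memory grew by roughly 800 MB on each of the four ranks.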
2025-12-04T15:04:54.9369688Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-16f6adac785f4a70.xml
2025-12-04T15:04:54.9369752Z ============================= test session starts ==============================
2025-12-04T15:04:54.9369872Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.9369919Z cachedir: .pytest_cache
2025-12-04T15:04:54.9370089Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.9370141Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.9370235Z configfile: pytest.ini
2025-12-04T15:04:54.9370409Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.9370489Z collecting ... collected 60 items / 26 deselected / 34 selected
2025-12-04T15:04:54.9370732Z stepcurrent: skipping 24 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9370782Z Running 1 items in this shard
2025-12-04T15:04:54.9370784Z
2025-12-04T15:04:54.9371109Z distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda I1204 15:03:29.740000 457194 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 457263
2025-12-04T15:04:54.9371277Z I1204 15:03:29.741000 457194 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 457264
2025-12-04T15:04:54.9371438Z I1204 15:03:29.741000 457194 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 457265
2025-12-04T15:04:54.9371603Z I1204 15:03:29.742000 457194 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 457266
2025-12-04T15:04:54.9371986Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9372038Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9372579Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9372663Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.9373055Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9373108Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9373627Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9373692Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.9374088Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9374138Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9374660Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9374725Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.9375098Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9375149Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9375662Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
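The FSDP UserWarning repeated on every rank above is advisory but actionable: the test passes `device_id` as a bare `cuda` device with no index, so FSDP falls back to the current device. The remedy the warning itself suggests looks like this in user code (a sketch with a stand-in module, not the test's actual model, and assuming the process group is already initialized as it is inside these tests):

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def build_fsdp_model(rank: int) -> FSDP:
        # Bind this process to its GPU before FSDP initialization, as the
        # warning suggests, and pass an indexed device instead of bare "cuda".
        torch.cuda.set_device(rank)
        device = torch.device("cuda", rank)
        model = nn.Linear(8, 8).to(device)  # stand-in for the test's transformer
        return FSDP(model, device_id=device)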
2025-12-04T15:04:54.9375726Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.9375878Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9376052Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9376360Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9376528Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9376835Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9376992Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9377287Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9377446Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9377752Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9377908Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9378203Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9378351Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9378653Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9378814Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9379325Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112.
2025-12-04T15:04:54.9379450Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9379656Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9380039Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9380159Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9380408Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9380584Z [rank2]:E1204 15:03:35.916000 457265 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T15:04:54.9380628Z dist init r=2, world=4
2025-12-04T15:04:54.9380775Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9380944Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9381249Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9381427Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9381746Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9381876Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9382172Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9382341Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9382634Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9382796Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9383102Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9383247Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9383540Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9383698Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9384208Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464.
2025-12-04T15:04:54.9384332Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9384542Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9384920Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9385042Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9385265Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9385439Z [rank3]:E1204 15:03:35.929000 457266 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T15:04:54.9385479Z dist init r=3, world=4
2025-12-04T15:04:54.9385628Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9385795Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9386112Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9386289Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9386592Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9386737Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9387030Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9387190Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9387483Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9387654Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9387949Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9388092Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9388387Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9388544Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9389055Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208.
2025-12-04T15:04:54.9389177Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9389384Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9389764Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9389885Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9390109Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9390323Z [rank0]:E1204 15:03:35.990000 457263 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T15:04:54.9390384Z dist init r=0, world=4
2025-12-04T15:04:54.9390530Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9390712Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9391017Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9391181Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9391503Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9391639Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9391938Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9392092Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9392399Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9392555Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9392849Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9392994Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9393291Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9393448Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9393954Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328.
2025-12-04T15:04:54.9394078Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9394285Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9394667Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9394784Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9395011Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9395208Z [rank1]:E1204 15:03:35.996000 457264 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T15:04:54.9395252Z dist init r=1, world=4
2025-12-04T15:04:54.9395294Z FAILED [7.3167s] [100%]
2025-12-04T15:04:54.9395297Z
2025-12-04T15:04:54.9395357Z =================================== FAILURES ===================================
2025-12-04T15:04:54.9395464Z __ TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda ___
2025-12-04T15:04:54.9395512Z Traceback (most recent call last):
2025-12-04T15:04:54.9395699Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.9395746Z self._join_processes(fn)
2025-12-04T15:04:54.9395929Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.9395987Z self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.9396175Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.9396221Z raise RuntimeError(error)
2025-12-04T15:04:54.9396306Z RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T15:04:54.9396353Z Traceback (most recent call last):
2025-12-04T15:04:54.9396537Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9396582Z getattr(self, test_name)()
2025-12-04T15:04:54.9396750Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9396787Z fn()
2025-12-04T15:04:54.9396947Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9396991Z method(*args, **kwargs)
2025-12-04T15:04:54.9397149Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9397192Z method(*args, **kwargs)
2025-12-04T15:04:54.9397354Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9397393Z with policy():
2025-12-04T15:04:54.9397561Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9397604Z raise RuntimeError(msg)
2025-12-04T15:04:54.9397980Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112.
2025-12-04T15:04:54.9397984Z
2025-12-04T15:04:54.9398064Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9398312Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9398314Z
2025-12-04T15:04:54.9398409Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9398412Z
2025-12-04T15:04:54.9398414Z
2025-12-04T15:04:54.9398494Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T15:04:54.9398589Z Process 2 terminated with exit code 10, terminating remaining processes.
2025-12-04T15:04:54.9398837Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-16f6adac785f4a70.xml -
2025-12-04T15:04:54.9398916Z =========================== short test summary info ============================
2025-12-04T15:04:54.9399193Z FAILED [7.3167s] distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T15:04:54.9399245Z Traceback (most recent call last):
2025-12-04T15:04:54.9399417Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9399465Z getattr(self, test_name)()
2025-12-04T15:04:54.9399633Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9399670Z fn()
2025-12-04T15:04:54.9399840Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9399884Z method(*args, **kwargs)
2025-12-04T15:04:54.9400044Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9400088Z method(*args, **kwargs)
2025-12-04T15:04:54.9400294Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9400336Z with policy():
2025-12-04T15:04:54.9400499Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9400561Z raise RuntimeError(msg)
2025-12-04T15:04:54.9400940Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112.
2025-12-04T15:04:54.9400943Z
2025-12-04T15:04:54.9401022Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9401267Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9401271Z
2025-12-04T15:04:54.9401362Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9401431Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T15:04:54.9401498Z ======================= 1 failed, 26 deselected in 7.47s =======================
2025-12-04T15:04:54.9401542Z Got exit code 1
2025-12-04T15:04:54.9401585Z Retrying single test...
2025-12-04T15:04:54.9401789Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e1cd3e6c1d45a339.xml
2025-12-04T15:04:54.9401850Z ============================= test session starts ==============================
2025-12-04T15:04:54.9401971Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.9402018Z cachedir: .pytest_cache
2025-12-04T15:04:54.9402187Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.9402235Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.9402281Z configfile: pytest.ini
2025-12-04T15:04:54.9402454Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.9402536Z collecting ... collected 60 items / 26 deselected / 34 selected
2025-12-04T15:04:54.9402776Z stepcurrent: skipping 24 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9402825Z Running 1 items in this shard
2025-12-04T15:04:54.9402828Z
2025-12-04T15:04:54.9403154Z distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda I1204 15:03:39.821000 457572 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 457641
2025-12-04T15:04:54.9403349Z I1204 15:03:39.822000 457572 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 457642
2025-12-04T15:04:54.9403513Z I1204 15:03:39.822000 457572 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 457643
2025-12-04T15:04:54.9403673Z I1204 15:03:39.823000 457572 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 457644
2025-12-04T15:04:54.9404077Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9404132Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9404655Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9404722Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.9405112Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9405163Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9405539Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9405590Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9405960Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
2025-12-04T15:04:54.9406010Z self.encoder = TransformerEncoder(
2025-12-04T15:04:54.9406527Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9406593Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.9407109Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9407175Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.9407694Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
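The other warning that repeats on every rank, from transformer.py:144, also names its own fix: construct the encoder layer with batch_first=True so TransformerEncoder can keep its nested-tensor fast path. A generic sketch (illustrative dimensions, not the test's model code):

    import torch.nn as nn

    # batch_first=True allows use_nested_tensor to stay True,
    # silencing the UserWarning seen on every rank above.
    layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
    encoder = nn.TransformerEncoder(layer, num_layers=2, enable_nested_tensor=True)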
2025-12-04T15:04:54.9407767Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.9407940Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9408110Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9408421Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9408597Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9408903Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9409039Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9409336Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9409508Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9409803Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9409961Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9410287Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9410434Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9410731Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9410892Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9411403Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328.
2025-12-04T15:04:54.9411526Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9411736Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9412119Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9412242Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9412483Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9412674Z [rank1]:E1204 15:03:45.988000 457642 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T15:04:54.9412716Z dist init r=1, world=4
2025-12-04T15:04:54.9412867Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9413040Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9413358Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9413525Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9413830Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9413964Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9414272Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9414432Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9414724Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9414887Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9415187Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9415331Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9415627Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9415784Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9416293Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464.
2025-12-04T15:04:54.9416414Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9416625Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9417007Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9417150Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9417380Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9417556Z [rank3]:E1204 15:03:45.990000 457644 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T15:04:54.9417597Z dist init r=3, world=4
2025-12-04T15:04:54.9417752Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9417923Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9418227Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9418392Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9418707Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9418842Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9419135Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9419292Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9419585Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9419741Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9420039Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9420220Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9420518Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9420679Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9421186Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208.
2025-12-04T15:04:54.9421307Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9421526Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9421920Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda
2025-12-04T15:04:54.9422040Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9422279Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9422454Z [rank0]:E1204 15:03:45.994000 457641 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T15:04:54.9422498Z dist init r=0, world=4
2025-12-04T15:04:54.9422644Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9422814Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9423134Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9423298Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9423601Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9423732Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9424029Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9424184Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9424480Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9424638Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9424934Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9425080Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9425375Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9425533Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9426045Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112.
2025-12-04T15:04:54.9426204Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9426411Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9426803Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T15:04:54.9426924Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9427147Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9427324Z [rank2]:E1204 15:03:46.039000 457643 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9427364Z dist init r=2, world=4 2025-12-04T15:04:54.9427406Z FAILED [7.3168s] [100%] 2025-12-04T15:04:54.9427408Z 2025-12-04T15:04:54.9427467Z =================================== FAILURES =================================== 2025-12-04T15:04:54.9427584Z __ TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda ___ 2025-12-04T15:04:54.9427633Z Traceback (most recent call last): 2025-12-04T15:04:54.9427806Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.9427853Z self._join_processes(fn) 2025-12-04T15:04:54.9428036Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.9428095Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.9428284Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.9428330Z raise RuntimeError(error) 2025-12-04T15:04:54.9428418Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9428465Z Traceback (most recent call last): 2025-12-04T15:04:54.9428639Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9428683Z getattr(self, test_name)() 2025-12-04T15:04:54.9428852Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9428888Z fn() 2025-12-04T15:04:54.9429049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9429096Z method(*args, **kwargs) 2025-12-04T15:04:54.9429255Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9429298Z method(*args, **kwargs) 2025-12-04T15:04:54.9429456Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9429500Z with policy(): 2025-12-04T15:04:54.9429661Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in 
__exit__ 2025-12-04T15:04:54.9429707Z raise RuntimeError(msg) 2025-12-04T15:04:54.9430078Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 2025-12-04T15:04:54.9430104Z 2025-12-04T15:04:54.9430225Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9430468Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T15:04:54.9430471Z 2025-12-04T15:04:54.9430567Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9430569Z 2025-12-04T15:04:54.9430634Z Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.9430682Z Traceback (most recent call last): 2025-12-04T15:04:54.9430869Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9430918Z getattr(self, test_name)() 2025-12-04T15:04:54.9431084Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9431125Z fn() 2025-12-04T15:04:54.9431283Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9431328Z method(*args, **kwargs) 2025-12-04T15:04:54.9431485Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9431530Z method(*args, **kwargs) 2025-12-04T15:04:54.9431706Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9431748Z with policy(): 2025-12-04T15:04:54.9431911Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9431955Z raise RuntimeError(msg) 2025-12-04T15:04:54.9432326Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T15:04:54.9432330Z 2025-12-04T15:04:54.9432410Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9432653Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T15:04:54.9432656Z 2025-12-04T15:04:54.9432748Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9432750Z 2025-12-04T15:04:54.9432814Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.9432860Z Traceback (most recent call last): 2025-12-04T15:04:54.9433033Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9433078Z getattr(self, test_name)() 2025-12-04T15:04:54.9433247Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9433282Z fn() 2025-12-04T15:04:54.9433442Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9433484Z method(*args, **kwargs) 2025-12-04T15:04:54.9433644Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9433686Z method(*args, **kwargs) 2025-12-04T15:04:54.9433846Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9433885Z with policy(): 2025-12-04T15:04:54.9434048Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9434107Z raise RuntimeError(msg) 2025-12-04T15:04:54.9434505Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T15:04:54.9434507Z 2025-12-04T15:04:54.9434585Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9434828Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T15:04:54.9434830Z 2025-12-04T15:04:54.9434934Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9434936Z 2025-12-04T15:04:54.9434938Z 2025-12-04T15:04:54.9435019Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.9435114Z Process 0 terminated with exit code 10, terminating remaining processes. 
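The RuntimeError repeated above comes from the harness's CUDA memory-leak checker, which records per-device memory from two independent sources, the caching allocator and the driver, before the test body runs and compares both afterwards; a leak is reported only when the two agree. Below is a minimal sketch of that pattern, assuming a CUDA/ROCm-enabled torch build; it is a simplification of, not the actual code in, torch/testing/_internal/common_utils.py, and run_test is a hypothetical stand-in for the test body under check.

import torch

def snapshot(device: int) -> tuple[int, int]:
    caching = torch.cuda.memory_allocated(device)  # caching-allocator bytes on this device
    free, total = torch.cuda.mem_get_info(device)  # driver's view: (free, total) bytes
    return caching, total - free                   # (allocator usage, driver usage)

def check_for_leak(run_test) -> None:
    devices = range(torch.cuda.device_count())
    before = [snapshot(d) for d in devices]
    run_test()
    torch.cuda.synchronize()
    after = [snapshot(d) for d in devices]
    for d, ((cb, db), (ca, da)) in enumerate(zip(before, after)):
        # Flag only when both counters grew, mirroring the "CUDA driver API
        # confirmed a leak" wording in the log above.
        if ca > cb and da > db:
            raise RuntimeError(
                f"Caching allocator allocated memory was {cb} and is now "
                f"reported as {ca} on device {d}. CUDA driver allocated "
                f"memory was {db} and is now {da}."
            )

The double bookkeeping is the design point: allocator growth alone can be caching noise, but the driver independently confirming it, as in the failures above, is what makes the leak report trustworthy.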
2025-12-04T15:04:54.9435362Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e1cd3e6c1d45a339.xml - 2025-12-04T15:04:54.9435429Z =========================== short test summary info ============================ 2025-12-04T15:04:54.9435703Z FAILED [7.3168s] distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9435754Z Traceback (most recent call last): 2025-12-04T15:04:54.9435927Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9435974Z getattr(self, test_name)() 2025-12-04T15:04:54.9436143Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9436181Z fn() 2025-12-04T15:04:54.9436340Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9436383Z method(*args, **kwargs) 2025-12-04T15:04:54.9436544Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9436586Z method(*args, **kwargs) 2025-12-04T15:04:54.9436748Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9436787Z with policy(): 2025-12-04T15:04:54.9436949Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9436992Z raise RuntimeError(msg) 2025-12-04T15:04:54.9437365Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 
2025-12-04T15:04:54.9437368Z 2025-12-04T15:04:54.9437445Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9437692Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T15:04:54.9437694Z 2025-12-04T15:04:54.9437786Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9437788Z 2025-12-04T15:04:54.9437853Z Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.9437900Z Traceback (most recent call last): 2025-12-04T15:04:54.9438072Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9438129Z getattr(self, test_name)() 2025-12-04T15:04:54.9438308Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9438345Z fn() 2025-12-04T15:04:54.9438504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9438545Z method(*args, **kwargs) 2025-12-04T15:04:54.9438704Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9438747Z method(*args, **kwargs) 2025-12-04T15:04:54.9438917Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9438960Z with policy(): 2025-12-04T15:04:54.9439122Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9439170Z raise RuntimeError(msg) 2025-12-04T15:04:54.9439540Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T15:04:54.9439543Z 2025-12-04T15:04:54.9439621Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9439880Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T15:04:54.9439882Z 2025-12-04T15:04:54.9439977Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9439979Z 2025-12-04T15:04:54.9440040Z Process 3 exited with error code 10 and exception: 2025-12-04T15:04:54.9440090Z Traceback (most recent call last): 2025-12-04T15:04:54.9440295Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9440342Z getattr(self, test_name)() 2025-12-04T15:04:54.9440511Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9440548Z fn() 2025-12-04T15:04:54.9440710Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9440757Z method(*args, **kwargs) 2025-12-04T15:04:54.9440915Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9440960Z method(*args, **kwargs) 2025-12-04T15:04:54.9441118Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9441161Z with policy(): 2025-12-04T15:04:54.9441319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9441365Z raise RuntimeError(msg) 2025-12-04T15:04:54.9441744Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T15:04:54.9441748Z 2025-12-04T15:04:54.9441824Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9442068Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T15:04:54.9442071Z 2025-12-04T15:04:54.9442161Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9442247Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T15:04:54.9442329Z ======================= 1 failed, 26 deselected in 7.46s ======================= 2025-12-04T15:04:54.9442371Z Got exit code 1 2025-12-04T15:04:54.9442560Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T15:04:54.9442698Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.9442896Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-08fa1ec62248e0cb.xml 2025-12-04T15:04:54.9442973Z ============================= test session starts ============================== 2025-12-04T15:04:54.9443092Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.9443138Z cachedir: .pytest_cache 2025-12-04T15:04:54.9443307Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.9443357Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.9443401Z configfile: pytest.ini 2025-12-04T15:04:54.9443573Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.9443650Z collecting ... collected 60 items / 25 deselected / 35 selected 2025-12-04T15:04:54.9443720Z stepcurrent: skipping 25 already run items. 2025-12-04T15:04:54.9443766Z Running 2 items in this shard 2025-12-04T15:04:54.9443770Z 2025-12-04T15:04:54.9444097Z distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_True_cuda I1204 15:03:49.714000 457950 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 458019 2025-12-04T15:04:54.9444263Z I1204 15:03:49.714000 457950 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 458020 2025-12-04T15:04:54.9444425Z I1204 15:03:49.715000 457950 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 458021 2025-12-04T15:04:54.9444585Z I1204 15:03:49.715000 457950 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 458022 2025-12-04T15:04:54.9444966Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.9445018Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9445326Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.9445399Z {} 2025-12-04T15:04:54.9445509Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.9445593Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.9446119Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9446188Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9446563Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.9446636Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9447009Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.9447057Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9447363Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.9447442Z {} 2025-12-04T15:04:54.9447556Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.9447635Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.9448154Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9448229Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9448531Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.9448598Z {} 2025-12-04T15:04:54.9448705Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.9448783Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.9449304Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.9449367Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9449741Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.9449791Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9450094Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.9450161Z {} 2025-12-04T15:04:54.9450308Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.9450383Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.9450902Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9450964Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9451132Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9451319Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9451633Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9451797Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9452119Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9452254Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9452549Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9452708Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9453014Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9453173Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9453463Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9453610Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9453905Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9454068Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9454585Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T15:04:54.9454715Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9454925Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9455306Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9455430Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9455654Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9455853Z [rank2]:E1204 15:03:55.748000 458021 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9455896Z dist init r=2, world=4 2025-12-04T15:04:54.9456044Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9456215Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9456536Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9456700Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9457004Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9457139Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9457444Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9457603Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9457895Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9458054Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9458347Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9458492Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9458788Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9458944Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9459451Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 
2025-12-04T15:04:54.9459574Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9459784Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9460164Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9460359Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9460586Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9460760Z [rank0]:E1204 15:03:55.750000 458019 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.9460803Z dist init r=0, world=4 2025-12-04T15:04:54.9460948Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9461131Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9461437Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9461603Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9461925Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9462058Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9462355Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9462510Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9462805Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9462959Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9463256Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9463399Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.9463694Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9463856Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9464358Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T15:04:54.9464483Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9464690Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9465092Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9465213Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9465438Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9465625Z [rank3]:E1204 15:03:55.802000 458022 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.9465770Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9465940Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9466244Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9466422Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9466726Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9466859Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9467153Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9467309Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9467603Z [rank1]:E1204 
15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9467758Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9468051Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9468196Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9468490Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9468648Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9469158Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 2025-12-04T15:04:54.9469291Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9469507Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9469888Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9470007Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9470282Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9470457Z [rank1]:E1204 15:03:55.802000 458020 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.9470503Z dist init r=1, world=4 2025-12-04T15:04:54.9470544Z dist init r=3, world=4 2025-12-04T15:04:54.9470588Z FAILED [7.1162s] [ 50%] 2025-12-04T15:04:54.9470591Z 2025-12-04T15:04:54.9470650Z =================================== FAILURES =================================== 2025-12-04T15:04:54.9470755Z ___ TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda ___ 2025-12-04T15:04:54.9470819Z Traceback (most recent call last): 2025-12-04T15:04:54.9470993Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.9471043Z self._join_processes(fn) 2025-12-04T15:04:54.9471227Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 1039, in _join_processes 2025-12-04T15:04:54.9471285Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.9471478Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.9471525Z raise RuntimeError(error) 2025-12-04T15:04:54.9471610Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9471660Z Traceback (most recent call last): 2025-12-04T15:04:54.9471835Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9471882Z getattr(self, test_name)() 2025-12-04T15:04:54.9472049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9472087Z fn() 2025-12-04T15:04:54.9472246Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9472290Z method(*args, **kwargs) 2025-12-04T15:04:54.9472448Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9472492Z method(*args, **kwargs) 2025-12-04T15:04:54.9472648Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9472690Z with policy(): 2025-12-04T15:04:54.9472850Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9472894Z raise RuntimeError(msg) 2025-12-04T15:04:54.9473265Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 
2025-12-04T15:04:54.9473283Z 2025-12-04T15:04:54.9473363Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9473626Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9473629Z 2025-12-04T15:04:54.9473722Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9473724Z 2025-12-04T15:04:54.9473786Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.9473836Z Traceback (most recent call last): 2025-12-04T15:04:54.9474006Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9474063Z getattr(self, test_name)() 2025-12-04T15:04:54.9474230Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9474267Z fn() 2025-12-04T15:04:54.9474427Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9474471Z method(*args, **kwargs) 2025-12-04T15:04:54.9474630Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9474671Z method(*args, **kwargs) 2025-12-04T15:04:54.9474830Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9474881Z with policy(): 2025-12-04T15:04:54.9475043Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9475086Z raise RuntimeError(msg) 2025-12-04T15:04:54.9475460Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T15:04:54.9475464Z 2025-12-04T15:04:54.9475542Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9475786Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9475788Z 2025-12-04T15:04:54.9475880Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9475882Z 2025-12-04T15:04:54.9475884Z 2025-12-04T15:04:54.9475966Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.9476059Z Process 0 terminated with exit code 10, terminating remaining processes. 
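The outer frames of the traceback above (_join_processes, then _check_return_codes) show the shape of these distributed tests: the parent spawns one worker per rank, joins them all, and converts any nonzero exit code back into the RuntimeError seen here. A minimal sketch of that pattern follows, assuming torch.multiprocessing; the worker body and the hard-coded exit code 10 are placeholders that mirror this log, not the harness's actual code in common_distributed.py.

import sys
import torch.multiprocessing as mp

WORLD_SIZE = 4

def worker(rank: int) -> None:
    # The per-rank test body would run here; simulate rank 0 detecting a
    # leak and exiting with the harness's failure code.
    sys.exit(10 if rank == 0 else 0)

def main() -> None:
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=worker, args=(r,)) for r in range(WORLD_SIZE)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    main()

This is why the pytest report names "Process 0" even though every rank logged the same leak: the parent checks return codes rank by rank and raises on the first nonzero one, appending the remaining ranks' exceptions afterwards.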
2025-12-04T15:04:54.9476310Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-08fa1ec62248e0cb.xml - 2025-12-04T15:04:54.9476377Z =========================== short test summary info ============================ 2025-12-04T15:04:54.9476636Z FAILED [7.1162s] distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9476685Z Traceback (most recent call last): 2025-12-04T15:04:54.9476857Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9476903Z getattr(self, test_name)() 2025-12-04T15:04:54.9477070Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9477107Z fn() 2025-12-04T15:04:54.9477264Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9477324Z method(*args, **kwargs) 2025-12-04T15:04:54.9477482Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9477537Z method(*args, **kwargs) 2025-12-04T15:04:54.9477696Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9477737Z with policy(): 2025-12-04T15:04:54.9477897Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9477941Z raise RuntimeError(msg) 2025-12-04T15:04:54.9478326Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 
2025-12-04T15:04:54.9478330Z 2025-12-04T15:04:54.9478408Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9478655Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9478657Z 2025-12-04T15:04:54.9478749Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9478751Z 2025-12-04T15:04:54.9478814Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.9478877Z Traceback (most recent call last): 2025-12-04T15:04:54.9479050Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9479095Z getattr(self, test_name)() 2025-12-04T15:04:54.9479264Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9479300Z fn() 2025-12-04T15:04:54.9479459Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9479502Z method(*args, **kwargs) 2025-12-04T15:04:54.9479661Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9479702Z method(*args, **kwargs) 2025-12-04T15:04:54.9479860Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9479900Z with policy(): 2025-12-04T15:04:54.9480061Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9480105Z raise RuntimeError(msg) 2025-12-04T15:04:54.9480512Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T15:04:54.9480516Z 2025-12-04T15:04:54.9480593Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9480833Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9480835Z 2025-12-04T15:04:54.9480926Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9480994Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.9481061Z ======================= 1 failed, 25 deselected in 7.28s ======================= 2025-12-04T15:04:54.9481101Z Got exit code 1 2025-12-04T15:04:54.9481144Z Retrying single test... 
2025-12-04T15:04:54.9481345Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2d684093973c4c23.xml 2025-12-04T15:04:54.9481435Z ============================= test session starts ============================== 2025-12-04T15:04:54.9481556Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.9481599Z cachedir: .pytest_cache 2025-12-04T15:04:54.9481766Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.9481815Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.9481859Z configfile: pytest.ini 2025-12-04T15:04:54.9482033Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.9482125Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.9482364Z stepcurrent: skipping 25 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9482413Z Running 1 items in this shard 2025-12-04T15:04:54.9482415Z 2025-12-04T15:04:54.9482738Z distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_True_cuda I1204 15:03:59.297000 458328 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 458397 2025-12-04T15:04:54.9482915Z I1204 15:03:59.298000 458328 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 458398 2025-12-04T15:04:54.9483077Z I1204 15:03:59.298000 458328 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 458399 2025-12-04T15:04:54.9483239Z I1204 15:03:59.299000 458328 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 458400 2025-12-04T15:04:54.9483630Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.9483682Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9483993Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.9484061Z {} 2025-12-04T15:04:54.9484171Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.9484250Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.9484778Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9484845Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9485220Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.9485270Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9485641Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.9485690Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9486005Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.9486084Z {} 2025-12-04T15:04:54.9486194Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.9486270Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.9486799Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9486863Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9487167Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.9487232Z {} 2025-12-04T15:04:54.9487341Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.9487415Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.9487941Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9488004Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9488380Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.9488429Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9488733Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.9488797Z {} 2025-12-04T15:04:54.9488906Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.9488981Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.9489497Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9489562Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9489717Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9489894Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9490237Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9490423Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9490745Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9490877Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9491190Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9491348Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9491642Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9491800Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9492107Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9492257Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9492552Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9492710Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9493218Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 2025-12-04T15:04:54.9493344Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9493553Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9493935Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9494057Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9494280Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9494455Z [rank0]:E1204 15:04:05.512000 458397 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.9494496Z dist init r=0, world=4 2025-12-04T15:04:54.9494646Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9494814Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9495140Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9495303Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9495607Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9495752Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9496046Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9496204Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9496494Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9496662Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9496955Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9497102Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9497400Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9497559Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9498069Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 
2025-12-04T15:04:54.9498190Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9498400Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9498782Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9498904Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9499132Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9499308Z [rank2]:E1204 15:04:05.513000 458399 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9499361Z dist init r=2, world=4 2025-12-04T15:04:54.9499523Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9499691Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9499998Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9500160Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9500516Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9500648Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9500942Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9501111Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9501405Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9501560Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9501854Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9501999Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.9502302Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9502457Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9502961Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 2025-12-04T15:04:54.9503083Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9503292Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9503678Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9503797Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9504038Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9504227Z [rank1]:E1204 15:04:05.527000 458398 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.9504268Z dist init r=1, world=4 2025-12-04T15:04:54.9504416Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9504586Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9504901Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9505064Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9505367Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9505501Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9505812Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9505969Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.9506263Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9506421Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9506718Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9506864Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9507162Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9507318Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9507829Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T15:04:54.9507952Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9508162Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9508547Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9508698Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9508924Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9509098Z [rank3]:E1204 15:04:05.563000 458400 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.9509144Z dist init r=3, world=4 2025-12-04T15:04:54.9509186Z FAILED [7.3171s] [100%] 2025-12-04T15:04:54.9509189Z 2025-12-04T15:04:54.9509261Z =================================== FAILURES =================================== 2025-12-04T15:04:54.9509370Z ___ TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda ___ 2025-12-04T15:04:54.9509419Z Traceback (most recent call last): 2025-12-04T15:04:54.9509592Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.9509640Z self._join_processes(fn) 2025-12-04T15:04:54.9509824Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 1039, in _join_processes 2025-12-04T15:04:54.9509881Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.9510084Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.9510130Z raise RuntimeError(error) 2025-12-04T15:04:54.9510254Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9510302Z Traceback (most recent call last): 2025-12-04T15:04:54.9510473Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9510519Z getattr(self, test_name)() 2025-12-04T15:04:54.9510692Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9510729Z fn() 2025-12-04T15:04:54.9510888Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9510931Z method(*args, **kwargs) 2025-12-04T15:04:54.9511091Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9511133Z method(*args, **kwargs) 2025-12-04T15:04:54.9511295Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9511334Z with policy(): 2025-12-04T15:04:54.9511497Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9511541Z raise RuntimeError(msg) 2025-12-04T15:04:54.9511920Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 
2025-12-04T15:04:54.9511923Z 2025-12-04T15:04:54.9512001Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9512248Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9512250Z 2025-12-04T15:04:54.9512345Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9512348Z 2025-12-04T15:04:54.9512410Z Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.9512476Z Traceback (most recent call last): 2025-12-04T15:04:54.9512647Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9512707Z getattr(self, test_name)() 2025-12-04T15:04:54.9512874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9512911Z fn() 2025-12-04T15:04:54.9513072Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9513115Z method(*args, **kwargs) 2025-12-04T15:04:54.9513290Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9513334Z method(*args, **kwargs) 2025-12-04T15:04:54.9513493Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9513536Z with policy(): 2025-12-04T15:04:54.9513698Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9513742Z raise RuntimeError(msg) 2025-12-04T15:04:54.9514129Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T15:04:54.9514132Z 2025-12-04T15:04:54.9514212Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9514458Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9514460Z 2025-12-04T15:04:54.9514553Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9514556Z 2025-12-04T15:04:54.9514618Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.9514665Z Traceback (most recent call last): 2025-12-04T15:04:54.9514837Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9514881Z getattr(self, test_name)() 2025-12-04T15:04:54.9515049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9515085Z fn() 2025-12-04T15:04:54.9515245Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9515287Z method(*args, **kwargs) 2025-12-04T15:04:54.9515447Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9515490Z method(*args, **kwargs) 2025-12-04T15:04:54.9515649Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9515689Z with policy(): 2025-12-04T15:04:54.9515850Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9515893Z raise RuntimeError(msg) 2025-12-04T15:04:54.9516265Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T15:04:54.9516267Z 2025-12-04T15:04:54.9516345Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9516589Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9516617Z 2025-12-04T15:04:54.9516708Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9516711Z 2025-12-04T15:04:54.9516714Z 2025-12-04T15:04:54.9516799Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.9516893Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T15:04:54.9517140Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2d684093973c4c23.xml - 2025-12-04T15:04:54.9517205Z =========================== short test summary info ============================ 2025-12-04T15:04:54.9517479Z FAILED [7.3171s] distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9517530Z Traceback (most recent call last): 2025-12-04T15:04:54.9521382Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9521441Z getattr(self, test_name)() 2025-12-04T15:04:54.9521622Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9521660Z fn() 2025-12-04T15:04:54.9521858Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9521903Z method(*args, **kwargs) 2025-12-04T15:04:54.9522071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9522114Z method(*args, **kwargs) 2025-12-04T15:04:54.9522276Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9522318Z with policy(): 2025-12-04T15:04:54.9522484Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9522528Z raise RuntimeError(msg) 2025-12-04T15:04:54.9522913Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 
2025-12-04T15:04:54.9522916Z 2025-12-04T15:04:54.9522996Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9523249Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9523251Z 2025-12-04T15:04:54.9523347Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9523350Z 2025-12-04T15:04:54.9523417Z Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.9523466Z Traceback (most recent call last): 2025-12-04T15:04:54.9523640Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9523689Z getattr(self, test_name)() 2025-12-04T15:04:54.9523860Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9523900Z fn() 2025-12-04T15:04:54.9524061Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9524108Z method(*args, **kwargs) 2025-12-04T15:04:54.9524266Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9524332Z method(*args, **kwargs) 2025-12-04T15:04:54.9524491Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9524554Z with policy(): 2025-12-04T15:04:54.9524717Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9524763Z raise RuntimeError(msg) 2025-12-04T15:04:54.9525138Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T15:04:54.9525140Z 2025-12-04T15:04:54.9525245Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9525492Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9525498Z 2025-12-04T15:04:54.9525593Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9525595Z 2025-12-04T15:04:54.9525657Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.9525709Z Traceback (most recent call last): 2025-12-04T15:04:54.9525881Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9525940Z getattr(self, test_name)() 2025-12-04T15:04:54.9526110Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9526150Z fn() 2025-12-04T15:04:54.9526312Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9526357Z method(*args, **kwargs) 2025-12-04T15:04:54.9526519Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9526565Z method(*args, **kwargs) 2025-12-04T15:04:54.9526731Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9526771Z with policy(): 2025-12-04T15:04:54.9526938Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9526982Z raise RuntimeError(msg) 2025-12-04T15:04:54.9527357Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T15:04:54.9527360Z 2025-12-04T15:04:54.9527438Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9527686Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9527689Z 2025-12-04T15:04:54.9527781Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9527852Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.9527921Z ======================= 1 failed, 26 deselected in 7.48s ======================= 2025-12-04T15:04:54.9527965Z Got exit code 1 2025-12-04T15:04:54.9528009Z Retrying single test... 
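Both UserWarnings repeated across these retries state their own remediation: build the TransformerEncoderLayer with batch_first=True, and give FSDP an unambiguous device by calling torch.cuda.set_device() (or passing an indexed device_id) before wrapping. A hedged sketch of both fixes, assuming an already-initialized process group; the model and its dimensions are made up:

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def build_model() -> nn.Module:
    # batch_first=True avoids the enable_nested_tensor warning from
    # torch/nn/modules/transformer.py:144 seen throughout this log.
    layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
    return nn.TransformerEncoder(layer, num_layers=2, enable_nested_tensor=True)

def wrap_fsdp(model: nn.Module) -> FSDP:
    rank = dist.get_rank()
    # Pinning the current device and passing an indexed device_id avoids the
    # "`device_id` cuda ... does not have an explicit index" warning from
    # _init_utils.py:571; FSDP then moves the module to that device itself.
    torch.cuda.set_device(rank)
    return FSDP(model, device_id=torch.device("cuda", rank))

Using the rank as the local device index only holds for single-node runs like this four-process, four-GPU job (world=4).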
2025-12-04T15:04:54.9528214Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-db5e240ffc5d4044.xml 2025-12-04T15:04:54.9528278Z ============================= test session starts ============================== 2025-12-04T15:04:54.9528403Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.9528474Z cachedir: .pytest_cache 2025-12-04T15:04:54.9528647Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.9528697Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.9528746Z configfile: pytest.ini 2025-12-04T15:04:54.9529033Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.9529119Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.9529374Z stepcurrent: skipping 25 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9529425Z Running 1 items in this shard 2025-12-04T15:04:54.9529427Z 2025-12-04T15:04:54.9529755Z distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_True_cuda I1204 15:04:09.015000 458706 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 458775 2025-12-04T15:04:54.9529921Z I1204 15:04:09.016000 458706 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 458776 2025-12-04T15:04:54.9530084Z I1204 15:04:09.016000 458706 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 458777 2025-12-04T15:04:54.9530300Z I1204 15:04:09.017000 458706 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 458778 2025-12-04T15:04:54.9530693Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.9530747Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9531063Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.9531136Z {} 2025-12-04T15:04:54.9531250Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.9531330Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.9531860Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9531929Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9532306Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.9532362Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9532741Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.9532792Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9533098Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.9533184Z {} 2025-12-04T15:04:54.9533315Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.9533396Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.9533920Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9533985Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9534306Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.9534372Z {} 2025-12-04T15:04:54.9534485Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.9534560Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.9535093Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9535157Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9535532Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2025-12-04T15:04:54.9535584Z self.encoder = TransformerEncoder( 2025-12-04T15:04:54.9535889Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T15:04:54.9535956Z {} 2025-12-04T15:04:54.9536065Z These modules will be wrapped as separate FSDP instances with mixed precision disabled. 2025-12-04T15:04:54.9536143Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T15:04:54.9536658Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9536725Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9536880Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9537058Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9537371Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9537540Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9537847Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9538009Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9538307Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9538465Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9538773Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9538931Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9539229Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9539375Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9539685Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9539847Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9540413Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T15:04:54.9540542Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9540752Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9541136Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9541260Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9541490Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9541668Z [rank2]:E1204 15:04:15.116000 458777 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9541711Z dist init r=2, world=4 2025-12-04T15:04:54.9541863Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9542035Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9542344Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9542547Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9542924Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9543058Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9543372Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9543530Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9543825Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9543985Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9544292Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9544441Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9544739Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9544901Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9545410Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T15:04:54.9545534Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9545746Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9546129Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9546253Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9546480Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9546657Z [rank1]:E1204 15:04:15.116000 458776 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.9546700Z dist init r=1, world=4 2025-12-04T15:04:54.9546849Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9547033Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9547357Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9547522Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9547843Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9547977Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9548271Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9548432Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9548737Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9548896Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9549196Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9549344Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T15:04:54.9549645Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9549803Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9550349Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 2025-12-04T15:04:54.9550472Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9550684Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9551065Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9551188Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9551417Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9551591Z [rank0]:E1204 15:04:15.128000 458775 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.9551674Z dist init r=0, world=4 2025-12-04T15:04:54.9551823Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9551996Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9552306Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9552489Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9552795Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9552931Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9553224Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9553399Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T15:04:54.9553697Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9553856Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9554153Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9554299Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9554599Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9554759Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9555271Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T15:04:54.9555397Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9555605Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9555988Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9556109Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9556349Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9556539Z [rank3]:E1204 15:04:15.202000 458778 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.9556584Z dist init r=3, world=4 2025-12-04T15:04:54.9556626Z FAILED [7.2159s] [100%] 2025-12-04T15:04:54.9556628Z 2025-12-04T15:04:54.9556694Z =================================== FAILURES =================================== 2025-12-04T15:04:54.9556800Z ___ TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda ___ 2025-12-04T15:04:54.9556865Z Traceback (most recent call last): 2025-12-04T15:04:54.9557040Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.9557092Z self._join_processes(fn) 2025-12-04T15:04:54.9557276Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 1039, in _join_processes 2025-12-04T15:04:54.9557337Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.9557530Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.9557577Z raise RuntimeError(error) 2025-12-04T15:04:54.9557675Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9557724Z Traceback (most recent call last): 2025-12-04T15:04:54.9557901Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9557948Z getattr(self, test_name)() 2025-12-04T15:04:54.9558119Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9558158Z fn() 2025-12-04T15:04:54.9558322Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9558367Z method(*args, **kwargs) 2025-12-04T15:04:54.9558530Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9558573Z method(*args, **kwargs) 2025-12-04T15:04:54.9558737Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9558778Z with policy(): 2025-12-04T15:04:54.9558945Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9558989Z raise RuntimeError(msg) 2025-12-04T15:04:54.9559368Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 
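Note: the _join_processes / _check_return_codes frames above show the shape of these distributed tests: the parent starts one subprocess per rank, joins them all, and raises if any rank's exit code is non-zero (10 is what the harness uses here when the leak check trips). A rough, hedged sketch of that join-and-check pattern; _rank_worker is a hypothetical stand-in for the per-rank test body:

    import multiprocessing as mp

    def _rank_worker(rank, world_size):
        # Hypothetical stand-in: the real harness runs the unittest method
        # here and exits the process with a non-zero code on failure.
        pass

    def run_across_ranks(world_size=4):
        procs = [mp.Process(target=_rank_worker, args=(r, world_size))
                 for r in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(
                    f"Process {rank} exited with error code {p.exitcode}")

This is why one leaking rank fails the whole test, and why the same traceback shows up once per rank before the pytest summary.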
2025-12-04T15:04:54.9559372Z 2025-12-04T15:04:54.9559457Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9559707Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9559710Z 2025-12-04T15:04:54.9559808Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9559811Z 2025-12-04T15:04:54.9559876Z Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.9559926Z Traceback (most recent call last): 2025-12-04T15:04:54.9560102Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9560210Z getattr(self, test_name)() 2025-12-04T15:04:54.9560382Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9560439Z fn() 2025-12-04T15:04:54.9560603Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9560648Z method(*args, **kwargs) 2025-12-04T15:04:54.9560807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9560854Z method(*args, **kwargs) 2025-12-04T15:04:54.9561013Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9561069Z with policy(): 2025-12-04T15:04:54.9561231Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9561279Z raise RuntimeError(msg) 2025-12-04T15:04:54.9561651Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T15:04:54.9561654Z 2025-12-04T15:04:54.9561736Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9561993Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9561996Z 2025-12-04T15:04:54.9562093Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9562097Z 2025-12-04T15:04:54.9562159Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.9562210Z Traceback (most recent call last): 2025-12-04T15:04:54.9562384Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9562433Z getattr(self, test_name)() 2025-12-04T15:04:54.9562601Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9562641Z fn() 2025-12-04T15:04:54.9562800Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9562848Z method(*args, **kwargs) 2025-12-04T15:04:54.9563008Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9563054Z method(*args, **kwargs) 2025-12-04T15:04:54.9563217Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9563258Z with policy(): 2025-12-04T15:04:54.9563422Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9563467Z raise RuntimeError(msg) 2025-12-04T15:04:54.9563841Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T15:04:54.9563843Z 2025-12-04T15:04:54.9563926Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9564172Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9564175Z 2025-12-04T15:04:54.9564266Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9564282Z 2025-12-04T15:04:54.9564284Z 2025-12-04T15:04:54.9564370Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.9564476Z Process 0 terminated with exit code 10, terminating remaining processes. 
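Note: the RuntimeError repeated above comes from PyTorch's opt-in memory-leak checker (PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 in the repro commands): it records caching-allocator and driver-level memory before the test body and compares afterwards. A minimal sketch of that before/after idea only, not the actual common_utils.py implementation; run_test_body is a hypothetical stand-in:

    import torch

    def check_for_leak(run_test_body, device=0):
        # Snapshot caching-allocator bytes and driver-reported usage first.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)
        free, total = torch.cuda.mem_get_info(device)
        driver_before = total - free

        run_test_body()

        # Re-check: memory still held on both counters after the test is
        # what the harness reports as a driver-confirmed leak.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free

        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"leak on device {device}: caching allocator {alloc_before} -> "
                f"{alloc_after} bytes, driver {driver_before} -> {driver_after}")

The figures in this log (512 -> 28160 caching-allocator bytes, and roughly 0.8 GB more driver memory per device) are exactly such a before/after pair, reported once per rank.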
2025-12-04T15:04:54.9564732Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-db5e240ffc5d4044.xml - 2025-12-04T15:04:54.9564801Z =========================== short test summary info ============================ 2025-12-04T15:04:54.9565068Z FAILED [7.2159s] distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9565131Z Traceback (most recent call last): 2025-12-04T15:04:54.9565310Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9565359Z getattr(self, test_name)() 2025-12-04T15:04:54.9565527Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9565568Z fn() 2025-12-04T15:04:54.9565728Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9565773Z method(*args, **kwargs) 2025-12-04T15:04:54.9565953Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9565998Z method(*args, **kwargs) 2025-12-04T15:04:54.9566157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9566200Z with policy(): 2025-12-04T15:04:54.9566361Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9566409Z raise RuntimeError(msg) 2025-12-04T15:04:54.9566782Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 
2025-12-04T15:04:54.9566786Z 2025-12-04T15:04:54.9566867Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9567110Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9567112Z 2025-12-04T15:04:54.9567204Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9567207Z 2025-12-04T15:04:54.9567271Z Process 1 exited with error code 10 and exception: 2025-12-04T15:04:54.9567319Z Traceback (most recent call last): 2025-12-04T15:04:54.9567495Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9567541Z getattr(self, test_name)() 2025-12-04T15:04:54.9567712Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9567749Z fn() 2025-12-04T15:04:54.9567911Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9567954Z method(*args, **kwargs) 2025-12-04T15:04:54.9568116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9568158Z method(*args, **kwargs) 2025-12-04T15:04:54.9568323Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9568363Z with policy(): 2025-12-04T15:04:54.9568538Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9568594Z raise RuntimeError(msg) 2025-12-04T15:04:54.9568968Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T15:04:54.9568970Z 2025-12-04T15:04:54.9569048Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9569302Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9569304Z 2025-12-04T15:04:54.9569397Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9569399Z 2025-12-04T15:04:54.9569464Z Process 2 exited with error code 10 and exception: 2025-12-04T15:04:54.9569513Z Traceback (most recent call last): 2025-12-04T15:04:54.9569688Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9569733Z getattr(self, test_name)() 2025-12-04T15:04:54.9569902Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9569939Z fn() 2025-12-04T15:04:54.9570113Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9570159Z method(*args, **kwargs) 2025-12-04T15:04:54.9570367Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9570413Z method(*args, **kwargs) 2025-12-04T15:04:54.9570572Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9570615Z with policy(): 2025-12-04T15:04:54.9570775Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9570822Z raise RuntimeError(msg) 2025-12-04T15:04:54.9571192Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T15:04:54.9571194Z 2025-12-04T15:04:54.9571272Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9571514Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9571517Z 2025-12-04T15:04:54.9571610Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9571680Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T15:04:54.9571750Z ======================= 1 failed, 26 deselected in 7.38s ======================= 2025-12-04T15:04:54.9571790Z Got exit code 1 2025-12-04T15:04:54.9571981Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_True_cuda 2025-12-04T15:04:54.9572118Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T15:04:54.9572322Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3356986c2ee505a2.xml 2025-12-04T15:04:54.9572385Z ============================= test session starts ============================== 2025-12-04T15:04:54.9572509Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.9572586Z cachedir: .pytest_cache 2025-12-04T15:04:54.9572759Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.9572812Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.9572855Z configfile: pytest.ini 2025-12-04T15:04:54.9573028Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.9573111Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.9573168Z stepcurrent: skipping 26 already run items. 2025-12-04T15:04:54.9573216Z Running 1 items in this shard 2025-12-04T15:04:54.9573232Z 2025-12-04T15:04:54.9573530Z distributed/fsdp/test_fsdp_core.py::TestAutogradCUDA::test_unshard_params_as_tensors_cuda I1204 15:04:18.815000 459084 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 459153 2025-12-04T15:04:54.9573699Z I1204 15:04:18.815000 459084 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 459154 2025-12-04T15:04:54.9573861Z I1204 15:04:18.816000 459084 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 459155 2025-12-04T15:04:54.9574024Z I1204 15:04:18.816000 459084 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 459156 2025-12-04T15:04:54.9574580Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9574648Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9575170Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.9575234Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9575757Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9575819Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9576333Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9576394Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9576704Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.9576750Z return func(*args, **kwargs) 2025-12-04T15:04:54.9576902Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9577077Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9577410Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9577576Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9577883Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9578029Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9578324Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9578483Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9578787Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9578944Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9579243Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9579389Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9579690Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9579849Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9580388Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T15:04:54.9580513Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9580723Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9581076Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda 2025-12-04T15:04:54.9581198Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9581428Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9581606Z [rank3]:E1204 15:04:26.261000 459156 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T15:04:54.9581667Z dist init r=3, world=4 2025-12-04T15:04:54.9581831Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9582001Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9582311Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9582487Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9582794Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9582927Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9583222Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9583394Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9583690Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9583848Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9584141Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9584288Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9584585Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9584747Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9585224Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T15:04:54.9585349Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9585556Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9585906Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda 2025-12-04T15:04:54.9586026Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9586250Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9586453Z [rank0]:E1204 15:04:26.265000 459153 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T15:04:54.9586494Z dist init r=0, world=4 2025-12-04T15:04:54.9586641Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9586811Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9587130Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9587296Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9587601Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9587733Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9588037Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9588195Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9588488Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9588646Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9588938Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9589084Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9589380Z 
[rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9589538Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9590020Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T15:04:54.9590142Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9590410Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9590759Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda 2025-12-04T15:04:54.9590907Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9591132Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9591306Z [rank1]:E1204 15:04:26.302000 459154 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T15:04:54.9593036Z dist init r=1, world=4 2025-12-04T15:04:54.9593185Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9593379Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9593684Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9593852Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9594173Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9594324Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9594622Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9594778Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9595075Z [rank2]:E1204 15:04:26.309000 
459155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9595231Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9595525Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9595675Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9595969Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9596127Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T15:04:54.9596610Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T15:04:54.9596734Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9596946Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9597309Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda 2025-12-04T15:04:54.9597430Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T15:04:54.9597656Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9597908Z [rank2]:E1204 15:04:26.309000 459155 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T15:04:54.9597949Z dist init r=2, world=4 2025-12-04T15:04:54.9598311Z [rank0]:[W1204 15:04:26.959477766 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T15:04:54.9598354Z FAILED [9.3208s] [100%] 2025-12-04T15:04:54.9598357Z 2025-12-04T15:04:54.9598418Z =================================== FAILURES =================================== 2025-12-04T15:04:54.9598513Z _____________ TestAutogradCUDA.test_unshard_params_as_tensors_cuda _____________ 2025-12-04T15:04:54.9598564Z Traceback (most recent call last): 2025-12-04T15:04:54.9598750Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T15:04:54.9598801Z self._join_processes(fn) 2025-12-04T15:04:54.9598991Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T15:04:54.9599048Z self._check_return_codes(fn, elapsed_time) 2025-12-04T15:04:54.9599240Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T15:04:54.9599287Z raise RuntimeError(error) 2025-12-04T15:04:54.9599373Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9599420Z Traceback (most recent call last): 2025-12-04T15:04:54.9599593Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9599638Z getattr(self, test_name)() 2025-12-04T15:04:54.9599809Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9599846Z fn() 2025-12-04T15:04:54.9600009Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9600053Z method(*args, **kwargs) 2025-12-04T15:04:54.9600259Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9600303Z method(*args, **kwargs) 2025-12-04T15:04:54.9600465Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9600504Z with policy(): 2025-12-04T15:04:54.9600667Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9600712Z raise RuntimeError(msg) 2025-12-04T15:04:54.9601060Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T15:04:54.9601062Z 2025-12-04T15:04:54.9601141Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9601359Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda 2025-12-04T15:04:54.9601380Z 2025-12-04T15:04:54.9601474Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9601478Z 2025-12-04T15:04:54.9601480Z 2025-12-04T15:04:54.9601561Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T15:04:54.9601657Z Process 0 terminated with exit code 10, terminating remaining processes. 
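Note: the ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit") is a separate teardown issue; the linked docs recommend pairing init with an explicit destroy. A minimal sketch, assuming MASTER_ADDR/MASTER_PORT are already set in the environment the way the harness sets them:

    import torch.distributed as dist

    def per_rank_main(rank, world_size):
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            pass  # per-rank test or training body goes here
        finally:
            # Explicit teardown avoids the resource-leak warning seen above.
            dist.destroy_process_group()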
2025-12-04T15:04:54.9601931Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3356986c2ee505a2.xml - 2025-12-04T15:04:54.9602016Z =========================== short test summary info ============================ 2025-12-04T15:04:54.9602252Z FAILED [9.3208s] distributed/fsdp/test_fsdp_core.py::TestAutogradCUDA::test_unshard_params_as_tensors_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T15:04:54.9602304Z Traceback (most recent call last): 2025-12-04T15:04:54.9602479Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9602526Z getattr(self, test_name)() 2025-12-04T15:04:54.9602695Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9602733Z fn() 2025-12-04T15:04:54.9602908Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9602954Z method(*args, **kwargs) 2025-12-04T15:04:54.9603115Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9603158Z method(*args, **kwargs) 2025-12-04T15:04:54.9603317Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9603358Z with policy(): 2025-12-04T15:04:54.9603518Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9603563Z raise RuntimeError(msg) 2025-12-04T15:04:54.9603911Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T15:04:54.9603915Z 2025-12-04T15:04:54.9603995Z To execute this test, run the following from the base repo dir: 2025-12-04T15:04:54.9604211Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda 2025-12-04T15:04:54.9604214Z 2025-12-04T15:04:54.9604306Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T15:04:54.9604378Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T15:04:54.9604444Z ======================= 1 failed, 26 deselected in 9.48s ======================= 2025-12-04T15:04:54.9604484Z Got exit code 1 2025-12-04T15:04:54.9604527Z Retrying single test... 
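Note: "Retrying single test..." is the runner's flakiness filter: after a failure it re-runs only the failing node id in a fresh pytest session (hence the new report file and the stepcurrent "Running only ..." line below) and, as happened for test_param_change_after_init_mixed_precision_True_cuda above, marks the test FAILED CONSISTENTLY when the rerun fails too. A rough sketch of that flow; the function name and return values are hypothetical:

    import subprocess
    import sys

    def classify_failure(node_id, retries=1):
        # Re-run just the failing test in a clean session.
        for _ in range(retries):
            rc = subprocess.call([sys.executable, "-m", "pytest", "-v", node_id])
            if rc == 0:
                return "flaky"  # passed in isolation
        # continue-through-error (set for this job) lets the run proceed anyway.
        return "failed consistently"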
2025-12-04T15:04:54.9604730Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-11e6e5ffde7b4e85.xml 2025-12-04T15:04:54.9604792Z ============================= test session starts ============================== 2025-12-04T15:04:54.9604914Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T15:04:54.9604958Z cachedir: .pytest_cache 2025-12-04T15:04:54.9605127Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T15:04:54.9605176Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T15:04:54.9605235Z configfile: pytest.ini 2025-12-04T15:04:54.9605408Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T15:04:54.9605489Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T15:04:54.9605696Z stepcurrent: skipping 26 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestAutogradCUDA::test_unshard_params_as_tensors_cuda 2025-12-04T15:04:54.9605745Z Running 1 items in this shard 2025-12-04T15:04:54.9605761Z 2025-12-04T15:04:54.9606073Z distributed/fsdp/test_fsdp_core.py::TestAutogradCUDA::test_unshard_params_as_tensors_cuda I1204 15:04:30.654000 459486 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 459555 2025-12-04T15:04:54.9606241Z I1204 15:04:30.654000 459486 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 459556 2025-12-04T15:04:54.9606402Z I1204 15:04:30.655000 459486 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 459557 2025-12-04T15:04:54.9606565Z I1204 15:04:30.656000 459486 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 459558 2025-12-04T15:04:54.9607107Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9607176Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9607694Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9607758Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9608274Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T15:04:54.9608336Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9608855Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T15:04:54.9608918Z device_from_device_id = _get_device_from_device_id( 2025-12-04T15:04:54.9609227Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T15:04:54.9609274Z return func(*args, **kwargs) 2025-12-04T15:04:54.9609427Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T15:04:54.9609600Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T15:04:54.9609906Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T15:04:54.9610097Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T15:04:54.9610450Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T15:04:54.9610599Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T15:04:54.9610908Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9611065Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9611363Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T15:04:54.9611535Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T15:04:54.9611831Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T15:04:54.9611977Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T15:04:54.9612273Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T15:04:54.9612433Z [rank2]:E1204 
2025-12-04T15:04:54.9609427Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9609600Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9609906Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9610097Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9610450Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9610599Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9610908Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9611065Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9611363Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9611535Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9611831Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9611977Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9612273Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9612433Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9612916Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336.
2025-12-04T15:04:54.9613041Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9613252Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9613603Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9613725Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9613952Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9614129Z [rank2]:E1204 15:04:38.037000 459557 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T15:04:54.9614171Z dist init r=2, world=4
2025-12-04T15:04:54.9614321Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9614509Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9614817Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9614981Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9615312Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9615445Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9615740Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9615897Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9616204Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9616364Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9616657Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9616804Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9617104Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9617264Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9617741Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432.
2025-12-04T15:04:54.9617864Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9618074Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9618422Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9618544Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9618770Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9618945Z [rank0]:E1204 15:04:38.038000 459555 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T15:04:54.9618998Z dist init r=0, world=4
2025-12-04T15:04:54.9619146Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9619315Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9619625Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9619816Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9620119Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9620290Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9620585Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9620757Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9621051Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9621208Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9621500Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9621646Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9621944Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9622104Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9622584Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688.
2025-12-04T15:04:54.9622705Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9622913Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9623262Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9623383Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9623608Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9623799Z [rank3]:E1204 15:04:38.041000 459558 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T15:04:54.9623842Z dist init r=3, world=4
2025-12-04T15:04:54.9623988Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9624173Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9624498Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9624662Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9624964Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9625115Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9625411Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9625568Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9625862Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9626020Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9626315Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9626459Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9626755Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9626913Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9627390Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552.
2025-12-04T15:04:54.9627511Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9627720Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9628068Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9628199Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9628424Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9628598Z [rank1]:E1204 15:04:38.095000 459556 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T15:04:54.9628653Z dist init r=1, world=4
2025-12-04T15:04:54.9629024Z [rank0]:[W1204 15:04:38.717889038 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
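The ProcessGroupNCCL warning above is advisory: the child processes exited without tearing down their process group. A short sketch of the explicit shutdown it recommends at the end of each rank's work (the barrier is optional; it just makes the teardown collective):

    import torch.distributed as dist

    # ... per-rank test body ...

    if dist.is_initialized():
        dist.barrier()                # drain outstanding collectives (optional)
        dist.destroy_process_group()  # release communicator resources explicitly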
2025-12-04T15:04:54.9629068Z FAILED [9.3192s] [100%]
2025-12-04T15:04:54.9629073Z
2025-12-04T15:04:54.9629132Z =================================== FAILURES ===================================
2025-12-04T15:04:54.9629229Z _____________ TestAutogradCUDA.test_unshard_params_as_tensors_cuda _____________
2025-12-04T15:04:54.9629278Z Traceback (most recent call last):
2025-12-04T15:04:54.9629455Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.9629513Z self._join_processes(fn)
2025-12-04T15:04:54.9629698Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.9629756Z self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.9629948Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.9629996Z raise RuntimeError(error)
2025-12-04T15:04:54.9630082Z RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T15:04:54.9630130Z Traceback (most recent call last):
2025-12-04T15:04:54.9630342Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9630388Z getattr(self, test_name)()
2025-12-04T15:04:54.9630557Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9630595Z fn()
2025-12-04T15:04:54.9630758Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9630804Z method(*args, **kwargs)
2025-12-04T15:04:54.9630964Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9631008Z method(*args, **kwargs)
2025-12-04T15:04:54.9631168Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9631209Z with policy():
2025-12-04T15:04:54.9631370Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9631415Z raise RuntimeError(msg)
2025-12-04T15:04:54.9631763Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432.
2025-12-04T15:04:54.9631766Z
2025-12-04T15:04:54.9631848Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9632062Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9632064Z
2025-12-04T15:04:54.9632175Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9632177Z
2025-12-04T15:04:54.9632179Z
2025-12-04T15:04:54.9632261Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T15:04:54.9632354Z Process 0 terminated with exit code 10, terminating remaining processes.
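The failure itself comes from PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: the harness snapshots allocator state before the test body and re-checks it afterwards, and here every rank grew from 512 to 61952 bytes. A simplified sketch of that before/after bookkeeping, assuming a CUDA/ROCm build; the real check in common_utils.py also compares the driver-level totals quoted in the message:

    import gc
    import torch

    def assert_no_allocator_growth(fn, device=0):
        # Simplified analogue of the mem-leak check: snapshot, run, re-check.
        gc.collect()
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)
        fn()
        gc.collect()
        torch.cuda.synchronize(device)
        after = torch.cuda.memory_allocated(device)
        if after > before:
            raise RuntimeError(
                f"possible leak on device {device}: {before} -> {after} bytes"
            )

The repro command printed above (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py ...) enables exactly this class of check outside CI.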
2025-12-04T15:04:54.9632605Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-11e6e5ffde7b4e85.xml -
2025-12-04T15:04:54.9632685Z =========================== short test summary info ============================
2025-12-04T15:04:54.9632934Z FAILED [9.3192s] distributed/fsdp/test_fsdp_core.py::TestAutogradCUDA::test_unshard_params_as_tensors_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T15:04:54.9632983Z Traceback (most recent call last):
2025-12-04T15:04:54.9633157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9633203Z getattr(self, test_name)()
2025-12-04T15:04:54.9633373Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9633410Z fn()
2025-12-04T15:04:54.9633570Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9633628Z method(*args, **kwargs)
2025-12-04T15:04:54.9633791Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9633834Z method(*args, **kwargs)
2025-12-04T15:04:54.9633993Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9634032Z with policy():
2025-12-04T15:04:54.9634195Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9634239Z raise RuntimeError(msg)
2025-12-04T15:04:54.9634586Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432.
2025-12-04T15:04:54.9634588Z
2025-12-04T15:04:54.9634667Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9634882Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9634885Z
2025-12-04T15:04:54.9634979Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9635045Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T15:04:54.9635113Z ======================= 1 failed, 26 deselected in 9.48s =======================
2025-12-04T15:04:54.9635152Z Got exit code 1
2025-12-04T15:04:54.9635195Z Retrying single test...
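"Retrying single test..." is the runner re-running only the failed test in a fresh process to separate flaky failures from reproducible ones; a second failure is what later earns the FAILED CONSISTENTLY label. A generic sketch of that rerun-and-classify step, with a hypothetical run_single_test helper standing in for the runner's actual invocation:

    import subprocess
    import sys

    def run_single_test(test_id: str) -> int:
        # Hypothetical helper: rerun one test id in a fresh interpreter.
        return subprocess.run([sys.executable, "-m", "pytest", test_id]).returncode

    def classify_failure(test_id: str, retries: int = 1) -> str:
        for _ in range(retries):
            if run_single_test(test_id) == 0:
                return "flaky"          # passed on rerun
        return "failed consistently"   # kept failing on every rerun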
2025-12-04T15:04:54.9635395Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-026c789416e56554.xml
2025-12-04T15:04:54.9635459Z ============================= test session starts ==============================
2025-12-04T15:04:54.9635580Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.9635625Z cachedir: .pytest_cache
2025-12-04T15:04:54.9635796Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.9635847Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.9635890Z configfile: pytest.ini
2025-12-04T15:04:54.9636064Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.9636161Z collecting ... collected 60 items / 26 deselected / 34 selected
2025-12-04T15:04:54.9636371Z stepcurrent: skipping 26 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestAutogradCUDA::test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9636418Z Running 1 items in this shard
2025-12-04T15:04:54.9636421Z
2025-12-04T15:04:54.9636721Z distributed/fsdp/test_fsdp_core.py::TestAutogradCUDA::test_unshard_params_as_tensors_cuda I1204 15:04:42.414000 459888 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 459957
2025-12-04T15:04:54.9636914Z I1204 15:04:42.415000 459888 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 459958
2025-12-04T15:04:54.9637077Z I1204 15:04:42.415000 459888 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 459959
2025-12-04T15:04:54.9637238Z I1204 15:04:42.416000 459888 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 459960
2025-12-04T15:04:54.9637780Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9637849Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.9638369Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9638432Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.9638948Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9639009Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.9639526Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2025-12-04T15:04:54.9639586Z device_from_device_id = _get_device_from_device_id(
2025-12-04T15:04:54.9639895Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T15:04:54.9639942Z return func(*args, **kwargs)
2025-12-04T15:04:54.9640095Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9640304Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9640611Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9640797Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9641099Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9641234Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9641565Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9641726Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9642025Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9642183Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9642494Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9642642Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9642936Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9643095Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9643575Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688.
2025-12-04T15:04:54.9643697Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9643905Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9644257Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9644379Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9644605Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9644779Z [rank3]:E1204 15:04:49.962000 459960 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10
2025-12-04T15:04:54.9644824Z dist init r=3, world=4
2025-12-04T15:04:54.9644971Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9645142Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9645461Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9645625Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9645932Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9646088Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9646385Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9646542Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9646841Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9647007Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9647304Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9647448Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9647743Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9647902Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9648379Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336.
2025-12-04T15:04:54.9648501Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9648708Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9649057Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9649179Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9649409Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9649585Z [rank2]:E1204 15:04:49.971000 459959 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10
2025-12-04T15:04:54.9649626Z dist init r=2, world=4
2025-12-04T15:04:54.9649773Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9649956Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9650296Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9650474Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9650791Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9650923Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9651221Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9651377Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9651689Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9651849Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9652141Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9652288Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9652584Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9652744Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9653220Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432.
2025-12-04T15:04:54.9653342Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9653550Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9653897Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9654019Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9654245Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9654438Z [rank0]:E1204 15:04:49.976000 459957 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10
2025-12-04T15:04:54.9654480Z dist init r=0, world=4
2025-12-04T15:04:54.9654628Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T15:04:54.9654798Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T15:04:54.9655127Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9655291Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T15:04:54.9655596Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9655728Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T15:04:54.9656033Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9656195Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9656489Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9656650Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T15:04:54.9656944Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9657091Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T15:04:54.9657392Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9657549Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T15:04:54.9658026Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552.
2025-12-04T15:04:54.9658148Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9658357Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9658706Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9658827Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935]
2025-12-04T15:04:54.9659067Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9659245Z [rank1]:E1204 15:04:49.990000 459958 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10
2025-12-04T15:04:54.9659288Z dist init r=1, world=4
2025-12-04T15:04:54.9659677Z [rank0]:[W1204 15:04:50.753056754 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T15:04:54.9659720Z FAILED [9.4192s] [100%]
2025-12-04T15:04:54.9659723Z
2025-12-04T15:04:54.9659782Z =================================== FAILURES ===================================
2025-12-04T15:04:54.9659880Z _____________ TestAutogradCUDA.test_unshard_params_as_tensors_cuda _____________
2025-12-04T15:04:54.9659928Z Traceback (most recent call last):
2025-12-04T15:04:54.9660103Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T15:04:54.9660149Z self._join_processes(fn)
2025-12-04T15:04:54.9660394Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T15:04:54.9660452Z self._check_return_codes(fn, elapsed_time)
2025-12-04T15:04:54.9660645Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T15:04:54.9660692Z raise RuntimeError(error)
2025-12-04T15:04:54.9660778Z RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T15:04:54.9660825Z Traceback (most recent call last):
2025-12-04T15:04:54.9661002Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9661047Z getattr(self, test_name)()
2025-12-04T15:04:54.9661217Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9661256Z fn()
2025-12-04T15:04:54.9661423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9661467Z method(*args, **kwargs)
2025-12-04T15:04:54.9661629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9661672Z method(*args, **kwargs)
2025-12-04T15:04:54.9661833Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9661872Z with policy():
2025-12-04T15:04:54.9662037Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9662079Z raise RuntimeError(msg)
2025-12-04T15:04:54.9662424Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336.
2025-12-04T15:04:54.9662427Z
2025-12-04T15:04:54.9662508Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9662720Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9662723Z
2025-12-04T15:04:54.9662817Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9662819Z
2025-12-04T15:04:54.9662821Z
2025-12-04T15:04:54.9662919Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T15:04:54.9663015Z Process 2 terminated with exit code 10, terminating remaining processes.
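The outer traceback (wrapper -> _join_processes -> _check_return_codes) shows the harness shape: the parent spawns one child per rank, joins them, and converts any nonzero child exit code into the RuntimeError seen here. A stripped-down sketch of that spawn/join pattern; it is illustrative, not the common_distributed.py implementation:

    import torch.multiprocessing as mp

    def _worker(rank: int, world_size: int) -> None:
        # Per-rank test body runs here; an uncaught exception or an explicit
        # nonzero exit (like code 10 above) is what the parent checks for.
        print(f"dist init r={rank}, world={world_size}")

    def run_multiprocess_test(world_size: int = 4) -> None:
        ctx = mp.spawn(_worker, args=(world_size,), nprocs=world_size, join=False)
        ctx.join()  # raises if any child exits abnormally, like _check_return_codes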
2025-12-04T15:04:54.9663260Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-026c789416e56554.xml -
2025-12-04T15:04:54.9663329Z =========================== short test summary info ============================
2025-12-04T15:04:54.9663579Z FAILED [9.4192s] distributed/fsdp/test_fsdp_core.py::TestAutogradCUDA::test_unshard_params_as_tensors_cuda - RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T15:04:54.9663646Z Traceback (most recent call last):
2025-12-04T15:04:54.9663821Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T15:04:54.9663866Z getattr(self, test_name)()
2025-12-04T15:04:54.9664037Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T15:04:54.9664077Z fn()
2025-12-04T15:04:54.9664238Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9664283Z method(*args, **kwargs)
2025-12-04T15:04:54.9664456Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T15:04:54.9664500Z method(*args, **kwargs)
2025-12-04T15:04:54.9664661Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T15:04:54.9664701Z with policy():
2025-12-04T15:04:54.9664861Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T15:04:54.9664904Z raise RuntimeError(msg)
2025-12-04T15:04:54.9665249Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestAutogradCUDA.test_unshard_params_as_tensors_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336.
2025-12-04T15:04:54.9665251Z
2025-12-04T15:04:54.9665329Z To execute this test, run the following from the base repo dir:
2025-12-04T15:04:54.9665542Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestAutogradCUDA.test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9665545Z
2025-12-04T15:04:54.9665637Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T15:04:54.9665709Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T15:04:54.9665775Z ======================= 1 failed, 26 deselected in 9.56s =======================
2025-12-04T15:04:54.9665815Z Got exit code 1
2025-12-04T15:04:54.9665979Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestAutogradCUDA::test_unshard_params_as_tensors_cuda
2025-12-04T15:04:54.9666120Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T15:04:54.9666330Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f0aab1b3afefef97.xml
2025-12-04T15:04:54.9666394Z ============================= test session starts ==============================
2025-12-04T15:04:54.9666516Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T15:04:54.9666561Z cachedir: .pytest_cache
2025-12-04T15:04:54.9666732Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T15:04:54.9666784Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T15:04:54.9666825Z configfile: pytest.ini
2025-12-04T15:04:54.9666998Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T15:04:54.9667094Z collecting ... collected 60 items / 27 deselected / 33 selected
2025-12-04T15:04:54.9667149Z stepcurrent: skipping 27 already run items.
2025-12-04T15:04:54.9667195Z Running 0 items in this shard
2025-12-04T15:04:54.9667200Z
2025-12-04T15:04:54.9667450Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f0aab1b3afefef97.xml -
2025-12-04T15:04:54.9667535Z ============================ 27 deselected in 0.01s ============================
2025-12-04T15:04:54.9672348Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_False_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_True_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_False_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_True_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestAutogradCUDA::test_unshard_params_as_tensors_cuda']
2025-12-04T15:04:54.9672373Z
2025-12-04T15:04:54.9672571Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_core 2/2 (test/test-reports/distributed.fsdp.test_fsdp_core_2.2_b1f81712a7b176a7_.log)
2025-12-04T15:04:54.9672573Z
2025-12-04T15:04:54.9672707Z Finished distributed/fsdp/test_fsdp_core 2/2 ... [2025-12-04 15:04:54.632206][2245805.122180826], took 26.93min
2025-12-04T15:04:54.9672986Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T15:04:54.9673100Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T15:04:54.9673217Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading
2025-12-04T15:04:54.9673270Z Uploading artifacts took 0.00 seconds
2025-12-04T15:04:54.9673325Z distributed/fsdp/test_fsdp_core 2/2 failed!
2025-12-04T15:04:54.9673433Z Running distributed/test_c10d_ucc 1/1 ... [2025-12-04 15:04:54.636205][2245805.126186064]
2025-12-04T15:04:54.9673486Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T15:04:54.9673835Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_ucc.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 15:04:54.636405]
2025-12-04T15:04:55.4811891Z
2025-12-04T15:04:55.4812340Z distributed/test_c10d_ucc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_ucc_1.1_e48a1f07adebacb6_.log
2025-12-04T15:04:55.4812601Z
2025-12-04T15:04:55.4812722Z Finished distributed/test_c10d_ucc 1/1 ... [2025-12-04 15:04:55.481048][2245805.971023369], took 0.01min
2025-12-04T15:04:55.4839548Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T15:04:55.4856178Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T15:04:55.4859204Z Running distributed/test_c10d_common 1/1 ... [2025-12-04 15:04:55.485800][2245805.975781587]
2025-12-04T15:04:55.4859539Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T15:04:55.4861295Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_common.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 15:04:55.486021]
2025-12-04T15:07:10.3449081Z
2025-12-04T15:07:10.3450395Z distributed/test_c10d_common 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_common_1.1_d7fb409a777d6cd6_.log
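Each "Executing [...]" entry is the runner shelling out to a single test file with shard flags. A minimal sketch of assembling such a command, mirroring the argument lists printed above (the cwd and the omitted error handling are assumptions):

    import subprocess
    import sys

    def run_test_file(test_file: str, shard_id: int, num_shards: int) -> int:
        # Mirror the argument list printed in the "Executing [...]" log entries.
        cmd = [
            sys.executable, "-bb", test_file,
            f"--shard-id={shard_id}", f"--num-shards={num_shards}",
            "-v", "--subprocess", "-vv", "-rfEX", "-p", "no:xdist",
            "--use-pytest", "-x", "--reruns=0",
            "--import-slow-tests", "--import-disabled-tests",
        ]
        return subprocess.run(cmd, cwd="test").returncode  # cwd is an assumption

    exit_code = run_test_file("distributed/test_c10d_common.py", 1, 1)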
2025-12-04T15:07:10.3458761Z Running 27 items in this shard: test/distributed/test_c10d_common.py::TimeoutTest::test_store_based_barrier, test/distributed/test_c10d_common.py::ComputeBucketAssignmentTest::test_multi_limit_multi_dtype, test/distributed/test_c10d_common.py::ComputeBucketAssignmentTest::test_multi_limit_single_dtype, test/distributed/test_c10d_common.py::ComputeBucketAssignmentTest::test_single_limit_multi_dtype, test/distributed/test_c10d_common.py::ComputeBucketAssignmentTest::test_single_limit_single_dtype, test/distributed/test_c10d_common.py::CommTest::test_debug_level, test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_abort, test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_backend_class_attr, test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_backend_config, test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_canonicalize_helper, test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_collectives, test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_get_backend_name, test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_init_process_group_with_multiple_backends, test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_is_backend_available, test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_send_recv, test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_shutdown, test/distributed/test_c10d_common.py::ProcessGroupWithDispatchedCollectivesTests::test_default_process_group, test/distributed/test_c10d_common.py::ProcessGroupWithDispatchedCollectivesTests::test_init_process_group_for_all_backends, test/distributed/test_c10d_common.py::ProcessGroupWithDispatchedCollectivesTests::test_init_process_group_optional_backend, test/distributed/test_c10d_common.py::ReduceOpTest::test_op_isinstance_of_reduceop, test/distributed/test_c10d_common.py::ReduceOpTest::test_reduceop_copyable, test/distributed/test_c10d_common.py::ReduceOpTest::test_reduceop_equal, test/distributed/test_c10d_common.py::ReduceOpTest::test_reduceop_pickle, test/distributed/test_c10d_common.py::LocalRankTest::testNodeLocalRank, test/distributed/test_c10d_common.py::LocalRankTest::testNodeLocalRankOverridesFallback, test/distributed/test_c10d_common.py::LocalRankTest::testWithoutEnv, test/distributed/test_c10d_common.py::LocalRankTest::testWithoutEnvWithFallback
2025-12-04T15:07:10.3465251Z Running 1 items in this shard: test/distributed/test_c10d_common.py::TimeoutTest::test_store_based_barrier
2025-12-04T15:07:10.3465807Z Running 1 items in this shard: test/distributed/test_c10d_common.py::ComputeBucketAssignmentTest::test_multi_limit_multi_dtype
2025-12-04T15:07:10.3466329Z Running 1 items in this shard: test/distributed/test_c10d_common.py::ComputeBucketAssignmentTest::test_multi_limit_single_dtype
2025-12-04T15:07:10.3466836Z Running 1 items in this shard: test/distributed/test_c10d_common.py::ComputeBucketAssignmentTest::test_single_limit_multi_dtype
2025-12-04T15:07:10.3467343Z Running 1 items in this shard: test/distributed/test_c10d_common.py::ComputeBucketAssignmentTest::test_single_limit_single_dtype
2025-12-04T15:07:10.3467791Z Running 1 items in this shard: test/distributed/test_c10d_common.py::CommTest::test_debug_level
2025-12-04T15:07:10.3468201Z Running 1 items in this shard: test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_abort
2025-12-04T15:07:10.3468689Z Running 1 items in this shard: test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_backend_class_attr
2025-12-04T15:07:10.3469175Z Running 1 items in this shard: test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_backend_config
2025-12-04T15:07:10.3469675Z Running 1 items in this shard: test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_canonicalize_helper
2025-12-04T15:07:10.3470158Z Running 1 items in this shard: test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_collectives
2025-12-04T15:07:10.3470685Z Running 1 items in this shard: test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_get_backend_name
2025-12-04T15:07:10.3471230Z Running 1 items in this shard: test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_init_process_group_with_multiple_backends
2025-12-04T15:07:10.3471784Z Running 1 items in this shard: test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_is_backend_available
2025-12-04T15:07:10.3472165Z Running 1 items in this shard: test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_send_recv
2025-12-04T15:07:10.3472530Z Running 1 items in this shard: test/distributed/test_c10d_common.py::PythonProcessGroupExtensionTest::test_shutdown
2025-12-04T15:07:10.3472940Z Running 1 items in this shard: test/distributed/test_c10d_common.py::ProcessGroupWithDispatchedCollectivesTests::test_default_process_group
2025-12-04T15:07:10.3473420Z Running 1 items in this shard: test/distributed/test_c10d_common.py::ProcessGroupWithDispatchedCollectivesTests::test_init_process_group_for_all_backends
2025-12-04T15:07:10.3473948Z Running 1 items in this shard: test/distributed/test_c10d_common.py::ProcessGroupWithDispatchedCollectivesTests::test_init_process_group_optional_backend
2025-12-04T15:07:10.3474376Z Running 1 items in this shard: test/distributed/test_c10d_common.py::ReduceOpTest::test_op_isinstance_of_reduceop
2025-12-04T15:07:10.3474706Z Running 1 items in this shard: test/distributed/test_c10d_common.py::ReduceOpTest::test_reduceop_copyable 2025-12-04T15:07:10.3475024Z Running 1 items in this shard: test/distributed/test_c10d_common.py::ReduceOpTest::test_reduceop_equal 2025-12-04T15:07:10.3475364Z Running 1 items in this shard: test/distributed/test_c10d_common.py::ReduceOpTest::test_reduceop_pickle 2025-12-04T15:07:10.3475696Z Running 1 items in this shard: test/distributed/test_c10d_common.py::LocalRankTest::testNodeLocalRank 2025-12-04T15:07:10.3476037Z Running 1 items in this shard: test/distributed/test_c10d_common.py::LocalRankTest::testNodeLocalRankOverridesFallback 2025-12-04T15:07:10.3476380Z Running 1 items in this shard: test/distributed/test_c10d_common.py::LocalRankTest::testWithoutEnv 2025-12-04T15:07:10.3476700Z Running 1 items in this shard: test/distributed/test_c10d_common.py::LocalRankTest::testWithoutEnvWithFallback 2025-12-04T15:07:10.3476892Z 2025-12-04T15:07:10.3477026Z Finished distributed/test_c10d_common 1/1 ... [2025-12-04 15:07:10.345106][2245940.835080595], took 2.25min 2025-12-04T15:07:10.3478267Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T15:07:10.3493994Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T15:07:10.3496346Z Running distributed/fsdp/test_fsdp_mixed_precision 1/1 ... [2025-12-04 15:07:10.349520][2245940.839501827] 2025-12-04T15:07:10.3496563Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T15:07:10.3498402Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_mixed_precision.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 15:07:10.349738] 2025-12-04T15:13:41.1862891Z 2025-12-04T15:13:41.1863771Z distributed/fsdp/test_fsdp_mixed_precision 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_mixed_precision_1.1_002cad0409afd4c1_.log 2025-12-04T15:13:41.1878451Z Running 66 items in this shard: test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_buffer_dtype_no_root_handle, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_eval_root_cast_inputs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_full_precision_in_eval, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_full_precision_in_eval_buffers, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_full_precision_in_eval_comm, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_grads_reduced_precision, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_input_grads_with_param_mixed_precision, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp32_none, 
test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp32_none, 
test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_no_reshard_after_forward, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_resnet, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_batchnorm_convert_sync_bn_False, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_batchnorm_convert_sync_bn_True, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_default, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_only_params_and_bufs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_params_and_reduce_diff, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_reduce, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionUnsharded::test_grads_reduced_precision, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionUnsharded::test_mixed_precision_e2e_full_shard, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionUnsharded::test_mixed_precision_no_reshard_after_forward, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionIgnoredModules::test_mixed_precision_with_ignored_module, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_float16_on_one_submodule, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_float16_on_one_submodule_skip_inputs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_float16_on_one_submodule_skip_inputs_error, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_submodules_with_different_precisions, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_submodules_with_different_precisions_error, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_submodules_with_external_inputs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPTrainEval::test_train_ema_eval_flow 2025-12-04T15:13:41.1892328Z 2025-12-04T15:13:41.1892475Z Finished distributed/fsdp/test_fsdp_mixed_precision 1/1 ... 
[2025-12-04 15:13:41.186437][2246331.676413084], took 6.51min 2025-12-04T15:13:41.1894394Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml 2025-12-04T15:13:41.1909330Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T15:13:41.1912473Z Running distributed/test_c10d_nccl 2/2 ... [2025-12-04 15:13:41.191009][2246331.680989236] 2025-12-04T15:13:41.1912939Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T15:13:41.1913939Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_nccl.py', '--shard-id=2', '--num-shards=2', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 15:13:41.191203] 2025-12-04T15:30:54.6915360Z 2025-12-04T15:30:54.6916158Z distributed/test_c10d_nccl 2/2 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_nccl_2.2_4bb7f464c909b0d4_.log 2025-12-04T15:30:54.6950762Z Running 142 items in this shard: test/distributed/test_c10d_nccl.py::TimeoutTest::test_default_store_timeout_nccl, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLInitTest::test_init_wo_backend_str, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_abort_in_destroy_multi_pgs, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_abort_in_destroy_pg, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_block_current_stream, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_close_pg_eager_init_False, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_comm_eager_init_subgroup, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_cuda_event_cache_mthd_race, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_deterministic_mode_no_break, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_extend_nccl_pg_timeout_backend0, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_get_uid, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_init_process_group_nccl_timeout, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_init_with_idx, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_check, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_new_group_eager_init_False, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_new_group_eager_init_True, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_non_blocking_init, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_non_blocking_p2p, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_non_blocking_with_eager_init, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_nccl_pg_timeout_backend_nccl, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_process_group_desc, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_backend_properties, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_basic, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_flags, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_nccl_config, 
test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_performance, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_validation, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_subgroup_p2p_eager_init_False, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_subgroup_p2p_eager_init_True, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_accumulate_gradients_module, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_accumulate_gradients_module_with_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_bf16_compress_wrapper_nccl, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_builtin_ddp_comm_hooks_nccl, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_dataclass_output, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_dataclass_output_unused_param, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_True, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_static_graph_use_reentrant_False, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_False, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_True, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_unused_params_use_reentrant_False, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_unused_params_use_reentrant_True, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_True, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_static_graph, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_complex_params, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_mixed_real_and_complex_params, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_packed_sequence, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_with_lazy_parameters, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_default_ddp_comm_hooks_nccl, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_default_ddp_comm_hooks_nccl_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_find_unused_parameters_kwarg_debug_detail, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_fp16_compress_wrapper_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_fp16_compress_wrapper_nccl, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_fp16_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_grad_layout_1devicemodule_1replicaperprocess, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_grad_layout_2devicemodule, 
test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_multiple_outputs_multiple_backward, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_multiple_outputs_multiple_backward_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_integer_list, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_torch_device_list, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_2gpu_module, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_4gpu_module, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_multi_device_ids_not_allowed, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_single_device_module_empty_device_ids, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_propagate_error_reason, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_no_grad, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_param_layout_mismatch_error, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_pass_default_pg, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_powerSGD_ddp_comm_hook_nccl, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_powerSGD_ddp_comm_hook_nccl_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_sync_batch_norm_empty_input, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_sync_batch_norm_only_empty_input, test/distributed/test_c10d_nccl.py::WorkHookTest::test_on_completion_hook_all_gather_object, test/distributed/test_c10d_nccl.py::WorkHookTest::test_on_completion_hook_broadcast, test/distributed/test_c10d_nccl.py::WorkHookTest::test_on_completion_hook_mixed_ops, test/distributed/test_c10d_nccl.py::WorkHookTest::test_on_completion_hook_seq, test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_nccl_blocking_wait_with_barrier, test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_nccl_errors_blocking, test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_nccl_non_blocking_wait_with_barrier, test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_restart_pg_after_error, test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_send_recv_non_dense_tensor, test/distributed/test_c10d_nccl.py::NcclUserBufferRegistrationTest::test_nccl_window_registration, test/distributed/test_c10d_nccl.py::CommTest::test_all_reduce_coalesced_nccl, test/distributed/test_c10d_nccl.py::CommTest::test_broadcast_coalesced_nccl, test/distributed/test_c10d_nccl.py::CommTest::test_nccl_warn_not_in_group_debug_detail, test/distributed/test_c10d_nccl.py::CommTest::test_nccl_warn_not_in_group_debug_info, test/distributed/test_c10d_nccl.py::CommTest::test_nncl_rank_membership, test/distributed/test_c10d_nccl.py::CommTest::test_pass_nccl_options_config, test/distributed/test_c10d_nccl.py::CommTest::test_pass_nccl_options_high_priority_stream, test/distributed/test_c10d_nccl.py::CommTest::test_reduce_scatter_tensor_coalesced, test/distributed/test_c10d_nccl.py::CommTest::test_sequence_num_incremented_nccl_default, test/distributed/test_c10d_nccl.py::CommTest::test_sequence_num_set_nccl_new_group, test/distributed/test_c10d_nccl.py::CommTest::test_tensor_dtype_complex, test/distributed/test_c10d_nccl.py::CommTest::test_tensor_dtype_mismatch, 
test/distributed/test_c10d_nccl.py::CommTest::test_wait_tensor, test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_all_to_all_single, test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_allreduce_coalesced, test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_default_process_group, test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_init_process_group_for_all_backends, test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_init_process_group_optional_backend, test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_object_list_subgroup_set_device0_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_object_list_subgroup_set_device1_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_object_list_subgroup_set_device1_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_object_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_object_subgroup_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_new_group_local_sync, test/distributed/test_c10d_nccl.py::LargeCommTest::test_new_group_local_sync_duplicated_pg, test/distributed/test_c10d_nccl.py::LargeCommTest::test_new_group_local_sync_sanity_check, test/distributed/test_c10d_nccl.py::LargeCommTest::test_scatter_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device0_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device0_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device1_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_subgroup_group_rank_False_async_op_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_subgroup_group_rank_True_async_op_False, test/distributed/test_c10d_nccl.py::SparseCollective::test_ddp_set_sparse_metadata, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_allgather_uneven_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_barrier_profiling, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce0_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_coalescing_manager_collective_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_dump_pipe, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_multiple_resets_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_multiple_resets_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_circular_buffer_full_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_timing_enabled_False, 
test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes1_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_long, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_False_include_collectives_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_True_include_collectives_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_True_include_collectives_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_pickle_timing_enabled_True_include_collectives_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_trace_while_active_timing_enabled_False_only_active_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_trace_while_active_timing_enabled_True_only_active_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_trace_while_all_works_retired, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLLargerScaleTest::test_comm_split_group_larger_scale 2025-12-04T15:30:54.6970467Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::TimeoutTest::test_default_store_timeout_nccl 2025-12-04T15:30:54.6970786Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLInitTest::test_init_wo_backend_str 2025-12-04T15:30:54.6971124Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_abort_in_destroy_multi_pgs 2025-12-04T15:30:54.6971458Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_abort_in_destroy_pg 2025-12-04T15:30:54.6971780Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_block_current_stream 2025-12-04T15:30:54.6972146Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_close_pg_eager_init_False 2025-12-04T15:30:54.6972486Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_comm_eager_init_subgroup 2025-12-04T15:30:54.6972824Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_cuda_event_cache_mthd_race 2025-12-04T15:30:54.6973169Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_deterministic_mode_no_break 2025-12-04T15:30:54.6973576Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_extend_nccl_pg_timeout_backend0 2025-12-04T15:30:54.6973902Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_get_uid 2025-12-04T15:30:54.6974224Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_init_process_group_nccl_timeout 2025-12-04T15:30:54.6974558Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_init_with_idx 2025-12-04T15:30:54.6974857Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_check 2025-12-04T15:30:54.6975175Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_new_group_eager_init_False 2025-12-04T15:30:54.6975537Z Running 1 
items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_new_group_eager_init_True 2025-12-04T15:30:54.6975866Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_non_blocking_init 2025-12-04T15:30:54.6976181Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_non_blocking_p2p 2025-12-04T15:30:54.6976512Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_non_blocking_with_eager_init 2025-12-04T15:30:54.6976868Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_nccl_pg_timeout_backend_nccl 2025-12-04T15:30:54.6977213Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_process_group_desc 2025-12-04T15:30:54.6977557Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_backend_properties 2025-12-04T15:30:54.6977897Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_basic 2025-12-04T15:30:54.6978214Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_flags 2025-12-04T15:30:54.6978540Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_nccl_config 2025-12-04T15:30:54.6978876Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_performance 2025-12-04T15:30:54.6979212Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_validation 2025-12-04T15:30:54.6979554Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_subgroup_p2p_eager_init_False 2025-12-04T15:30:54.6979906Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_subgroup_p2p_eager_init_True 2025-12-04T15:30:54.6980305Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_accumulate_gradients_module 2025-12-04T15:30:54.6980684Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_accumulate_gradients_module_with_grad_is_view 2025-12-04T15:30:54.6981062Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_bf16_compress_wrapper_nccl 2025-12-04T15:30:54.6981428Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_builtin_ddp_comm_hooks_nccl 2025-12-04T15:30:54.6981766Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_dataclass_output 2025-12-04T15:30:54.6982103Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_dataclass_output_unused_param 2025-12-04T15:30:54.6982484Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_module 2025-12-04T15:30:54.6982883Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_False 2025-12-04T15:30:54.6983278Z Running 1 items in this shard: 
test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_True 2025-12-04T15:30:54.6983692Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_static_graph_use_reentrant_False 2025-12-04T15:30:54.6984107Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_False 2025-12-04T15:30:54.6984514Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_True 2025-12-04T15:30:54.6984919Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_unused_params_use_reentrant_False 2025-12-04T15:30:54.6985336Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_unused_params_use_reentrant_True 2025-12-04T15:30:54.6985749Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_True 2025-12-04T15:30:54.6986162Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_grad_is_view 2025-12-04T15:30:54.6986564Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_static_graph 2025-12-04T15:30:54.6986931Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_complex_params 2025-12-04T15:30:54.6987282Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_mixed_real_and_complex_params 2025-12-04T15:30:54.6987633Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_packed_sequence 2025-12-04T15:30:54.6987966Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_with_lazy_parameters 2025-12-04T15:30:54.6988312Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_default_ddp_comm_hooks_nccl 2025-12-04T15:30:54.6988673Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_default_ddp_comm_hooks_nccl_is_view 2025-12-04T15:30:54.6989055Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_find_unused_parameters_kwarg_debug_detail 2025-12-04T15:30:54.6989430Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_fp16_compress_wrapper_is_view 2025-12-04T15:30:54.6989784Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_fp16_compress_wrapper_nccl 2025-12-04T15:30:54.6990118Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_fp16_grad_is_view 2025-12-04T15:30:54.6990512Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_grad_layout_1devicemodule_1replicaperprocess 2025-12-04T15:30:54.6990900Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_grad_layout_2devicemodule 2025-12-04T15:30:54.6991258Z Running 1 items in this shard: 
test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_multiple_outputs_multiple_backward 2025-12-04T15:30:54.6991651Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_multiple_outputs_multiple_backward_grad_is_view 2025-12-04T15:30:54.6992093Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_integer_list 2025-12-04T15:30:54.6992509Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_torch_device_list 2025-12-04T15:30:54.6992891Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_2gpu_module 2025-12-04T15:30:54.6993232Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_4gpu_module 2025-12-04T15:30:54.6993597Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_multi_device_ids_not_allowed 2025-12-04T15:30:54.6994014Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_single_device_module_empty_device_ids 2025-12-04T15:30:54.6994400Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_propagate_error_reason 2025-12-04T15:30:54.6994722Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_no_grad 2025-12-04T15:30:54.6995047Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_param_layout_mismatch_error 2025-12-04T15:30:54.6995384Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_pass_default_pg 2025-12-04T15:30:54.6995716Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_powerSGD_ddp_comm_hook_nccl 2025-12-04T15:30:54.6996084Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_powerSGD_ddp_comm_hook_nccl_grad_is_view 2025-12-04T15:30:54.6996457Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_sync_batch_norm_empty_input 2025-12-04T15:30:54.6996818Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_sync_batch_norm_only_empty_input 2025-12-04T15:30:54.6997166Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::WorkHookTest::test_on_completion_hook_all_gather_object 2025-12-04T15:30:54.6997478Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::WorkHookTest::test_on_completion_hook_broadcast 2025-12-04T15:30:54.6997781Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::WorkHookTest::test_on_completion_hook_mixed_ops 2025-12-04T15:30:54.6998078Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::WorkHookTest::test_on_completion_hook_seq 2025-12-04T15:30:54.6998394Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_nccl_blocking_wait_with_barrier 2025-12-04T15:30:54.6998719Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_nccl_errors_blocking 2025-12-04T15:30:54.6999052Z Running 1 items in this shard: 
test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_nccl_non_blocking_wait_with_barrier 2025-12-04T15:30:54.6999383Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_restart_pg_after_error 2025-12-04T15:30:54.6999717Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_send_recv_non_dense_tensor 2025-12-04T15:30:54.7000056Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclUserBufferRegistrationTest::test_nccl_window_registration 2025-12-04T15:30:54.7000424Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_all_reduce_coalesced_nccl 2025-12-04T15:30:54.7000711Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_broadcast_coalesced_nccl 2025-12-04T15:30:54.7001021Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_nccl_warn_not_in_group_debug_detail 2025-12-04T15:30:54.7001342Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_nccl_warn_not_in_group_debug_info 2025-12-04T15:30:54.7001634Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_nncl_rank_membership 2025-12-04T15:30:54.7001913Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_pass_nccl_options_config 2025-12-04T15:30:54.7002215Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_pass_nccl_options_high_priority_stream 2025-12-04T15:30:54.7002524Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_reduce_scatter_tensor_coalesced 2025-12-04T15:30:54.7002843Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_sequence_num_incremented_nccl_default 2025-12-04T15:30:54.7003154Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_sequence_num_set_nccl_new_group 2025-12-04T15:30:54.7003443Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_tensor_dtype_complex 2025-12-04T15:30:54.7003719Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_tensor_dtype_mismatch 2025-12-04T15:30:54.7003987Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_wait_tensor 2025-12-04T15:30:54.7004323Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_all_to_all_single 2025-12-04T15:30:54.7004722Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_allreduce_coalesced 2025-12-04T15:30:54.7005125Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_default_process_group 2025-12-04T15:30:54.7005553Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_init_process_group_for_all_backends 2025-12-04T15:30:54.7006002Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_init_process_group_optional_backend 2025-12-04T15:30:54.7006421Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_object_list_subgroup_set_device0_group_rank_False 2025-12-04T15:30:54.7006812Z Running 1 items in this shard: 
test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_object_list_subgroup_set_device1_group_rank_False 2025-12-04T15:30:54.7007237Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_object_list_subgroup_set_device1_group_rank_True 2025-12-04T15:30:54.7007593Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_subgroup_group_rank_False 2025-12-04T15:30:54.7007925Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_object_subgroup_group_rank_False 2025-12-04T15:30:54.7008261Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_object_subgroup_group_rank_True 2025-12-04T15:30:54.7008588Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_subgroup_group_rank_False 2025-12-04T15:30:54.7008918Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_new_group_local_sync 2025-12-04T15:30:54.7009223Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_new_group_local_sync_duplicated_pg 2025-12-04T15:30:54.7009542Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_new_group_local_sync_sanity_check 2025-12-04T15:30:54.7009859Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_scatter_subgroup_group_rank_False 2025-12-04T15:30:54.7010282Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device0_group_rank_False 2025-12-04T15:30:54.7010666Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device0_group_rank_True 2025-12-04T15:30:54.7011048Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device1_group_rank_False 2025-12-04T15:30:54.7011422Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_subgroup_group_rank_False_async_op_True 2025-12-04T15:30:54.7011804Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_subgroup_group_rank_True_async_op_False 2025-12-04T15:30:54.7012154Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::SparseCollective::test_ddp_set_sparse_metadata 2025-12-04T15:30:54.7012471Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_allgather_uneven_timing_enabled_False 2025-12-04T15:30:54.7012781Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_barrier_profiling 2025-12-04T15:30:54.7013119Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce0_timing_enabled_False 2025-12-04T15:30:54.7013515Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_False 2025-12-04T15:30:54.7013910Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_True 2025-12-04T15:30:54.7014300Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_coalescing_manager_collective_timing_enabled_True 2025-12-04T15:30:54.7014622Z Running 1 items in this shard: 
test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_dump_pipe 2025-12-04T15:30:54.7014937Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_multiple_resets_timing_enabled_False 2025-12-04T15:30:54.7015296Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_multiple_resets_timing_enabled_True 2025-12-04T15:30:54.7015669Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_circular_buffer_full_timing_enabled_True 2025-12-04T15:30:54.7016030Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_timing_enabled_False 2025-12-04T15:30:54.7016383Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_False 2025-12-04T15:30:54.7016754Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_True 2025-12-04T15:30:54.7017128Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes1_timing_enabled_True 2025-12-04T15:30:54.7017443Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_long 2025-12-04T15:30:54.7017764Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_False_include_collectives_True 2025-12-04T15:30:54.7018161Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_True_include_collectives_False 2025-12-04T15:30:54.7018542Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_True_include_collectives_True 2025-12-04T15:30:54.7018921Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_pickle_timing_enabled_True_include_collectives_False 2025-12-04T15:30:54.7019331Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_trace_while_active_timing_enabled_False_only_active_False 2025-12-04T15:30:54.7019711Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_trace_while_active_timing_enabled_True_only_active_True 2025-12-04T15:30:54.7020054Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_trace_while_all_works_retired 2025-12-04T15:30:54.7020446Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLLargerScaleTest::test_comm_split_group_larger_scale 2025-12-04T15:30:54.7020655Z 2025-12-04T15:30:54.7020772Z Finished distributed/test_c10d_nccl 2/2 ... 
[2025-12-04 15:30:54.692964][2247365.182941774], took 17.23min
2025-12-04T15:30:54.7021201Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-40da44670c5524ca.xml
2025-12-04T15:30:54.7021598Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T15:30:54.7021823Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading
2025-12-04T15:30:54.7022010Z Uploading artifacts took 0.00 seconds
2025-12-04T15:30:56.8207269Z Running test batch 'tests to run' cost 9384.16 seconds
2025-12-04T15:30:56.8211474Z Emitting td_test_failure_stats_v2
2025-12-04T15:30:56.8214536Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764862256_3a754954d12611f0a707f22a29242828
2025-12-04T15:30:58.8403968Z /var/lib/jenkins/pytorch/tools/stats/upload_metrics.py:156: UserWarning: Error uploading metric td_test_failure_stats_v2 to DynamoDB: Unable to locate credentials
2025-12-04T15:30:58.8404581Z warn(f"Error uploading metric {metric_name} to DynamoDB: {e}")
2025-12-04T15:30:58.8405516Z Emitting td_test_failure_stats_v2
2025-12-04T15:30:58.8407467Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764862258_3ba96882d12611f0a707f22a29242828
2025-12-04T15:30:58.8426069Z Emitting td_test_failure_stats_v2
2025-12-04T15:30:58.8426576Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764862258_3ba9b4fed12611f0a707f22a29242828
2025-12-04T15:30:58.8443994Z Emitting td_test_failure_stats_v2
2025-12-04T15:30:58.8444475Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764862258_3ba9fb1cd12611f0a707f22a29242828
2025-12-04T15:30:58.8462228Z Emitting td_test_failure_stats_v2
2025-12-04T15:30:58.8462681Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764862258_3baa4270d12611f0a707f22a29242828
2025-12-04T15:30:58.8478440Z distributed/fsdp/test_fsdp_pure_fp16 1/1 failed!
2025-12-04T15:30:58.8478714Z distributed/fsdp/test_fsdp_apply 1/1 failed!
2025-12-04T15:30:58.8478971Z distributed/fsdp/test_fsdp_multiple_wrapping 1/1 failed!
2025-12-04T15:30:58.8479247Z distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 failed!
2025-12-04T15:30:58.8479493Z distributed/fsdp/test_fsdp_core 2/2 failed!
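The Executing [...] lines above show how each test file is launched with an explicit --shard-id/--num-shards pair, and how per-test isolation (the --subprocess flag, plus the repeated "Running 1 items in this shard" lines) breaks a file down into single-item runs. As a minimal sketch only, assuming a plain round-robin split (PyTorch's actual sharder balances shards using recorded test durations, which this does not model):

    # Hypothetical round-robin shard assignment; illustration only, not the
    # scheduler run_test.py actually uses (that one balances by test timings).
    def assign_to_shard(test_files, shard_id, num_shards):
        """Return the files that 1-based shard `shard_id` of `num_shards` runs."""
        ordered = sorted(test_files)
        return [f for i, f in enumerate(ordered) if i % num_shards == shard_id - 1]

    files = [
        "distributed/test_c10d_common.py",
        "distributed/test_c10d_nccl.py",
        "distributed/fsdp/test_fsdp_mixed_precision.py",
    ]
    for shard in range(1, 3):
        print(shard, assign_to_shard(files, shard, 2))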
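The recurring "Unable to locate credentials" warnings above are botocore's NoCredentialsError surfacing: the metric and test-report uploaders warn and continue when no AWS credential source is available to the process. A minimal sketch of that upload-and-warn pattern, assuming boto3 is installed; the bucket name matches the log, while the key and payload are hypothetical:

    # Sketch of the upload-and-warn pattern behind the credential warnings.
    import boto3
    from botocore.exceptions import NoCredentialsError

    def upload_metric_document(body: bytes) -> None:
        s3 = boto3.client("s3")
        try:
            s3.put_object(
                Bucket="ossci-raw-job-status",              # bucket seen in the log
                Key="ossci_uploaded_metrics/example.json",  # hypothetical key
                Body=body,
            )
        except NoCredentialsError as exc:
            # Warn and keep going, as the log does, instead of failing the job.
            print(f"Error uploading metric to S3: {exc}")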
2025-12-04T15:30:59.5680731Z
2025-12-04T15:30:59.5680851Z real 156m30.041s
2025-12-04T15:30:59.5681017Z user 330m52.718s
2025-12-04T15:30:59.5681183Z sys 389m20.201s
2025-12-04T15:30:59.5681341Z + sccache_epilogue
2025-12-04T15:30:59.5681563Z + echo '::group::Sccache Compilation Log'
2025-12-04T15:30:59.5682370Z ##[group]Sccache Compilation Log
2025-12-04T15:30:59.5682630Z + echo '=================== sccache compilation log ==================='
2025-12-04T15:30:59.5682900Z =================== sccache compilation log ===================
2025-12-04T15:30:59.5683289Z + python /var/lib/jenkins/pytorch/.ci/pytorch/print_sccache_log.py /var/lib/jenkins/sccache_error.log
2025-12-04T15:30:59.5760241Z + echo '=========== If your build fails, please take a look at the log above for possible reasons ==========='
2025-12-04T15:30:59.5760756Z =========== If your build fails, please take a look at the log above for possible reasons ===========
2025-12-04T15:30:59.5761030Z + sccache --show-stats
2025-12-04T15:30:59.5780366Z Compile requests 710
2025-12-04T15:30:59.5780572Z Compile requests executed 12
2025-12-04T15:30:59.5780784Z Cache hits 0
2025-12-04T15:30:59.5780980Z Cache misses 12
2025-12-04T15:30:59.5781165Z Cache misses (C/C++) 12
2025-12-04T15:30:59.5781350Z Cache hits rate 0.00 %
2025-12-04T15:30:59.5781539Z Cache hits rate (C/C++) 0.00 %
2025-12-04T15:30:59.5781723Z Cache timeouts 0
2025-12-04T15:30:59.5781910Z Cache read errors 0
2025-12-04T15:30:59.5782079Z Forced recaches 0
2025-12-04T15:30:59.5782260Z Cache write errors 0
2025-12-04T15:30:59.5782509Z Cache errors 0
2025-12-04T15:30:59.5782689Z Compilations 12
2025-12-04T15:30:59.5782866Z Compilation failures 0
2025-12-04T15:30:59.5783050Z Non-cacheable compilations 0
2025-12-04T15:30:59.5783241Z Non-cacheable calls 13
2025-12-04T15:30:59.5783415Z Non-compilation calls 685
2025-12-04T15:30:59.5783604Z Unsupported compiler calls 0
2025-12-04T15:30:59.5783807Z Average cache write 0.000 s
2025-12-04T15:30:59.5784001Z Average compiler 0.938 s
2025-12-04T15:30:59.5784179Z Average cache read hit 0.000 s
2025-12-04T15:30:59.5784377Z Failed distributed compilations 0
2025-12-04T15:30:59.5784509Z
2025-12-04T15:30:59.5784573Z Non-cacheable reasons:
2025-12-04T15:30:59.5784727Z -E 7
2025-12-04T15:30:59.5784899Z unknown source language 6
2025-12-04T15:30:59.5785019Z
2025-12-04T15:30:59.5785138Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache"
2025-12-04T15:30:59.5785380Z Use direct/preprocessor mode? yes
2025-12-04T15:30:59.5785561Z Version (client) 0.10.0
2025-12-04T15:30:59.5785751Z Cache size 710 KiB
2025-12-04T15:30:59.5785933Z Max cache size 10 GiB
2025-12-04T15:30:59.5786114Z + sccache --stop-server
2025-12-04T15:30:59.5792862Z Stopping sccache server...
2025-12-04T15:30:59.5795050Z Compile requests 710
2025-12-04T15:30:59.5795378Z Compile requests executed 12
2025-12-04T15:30:59.5795618Z Cache hits 0
2025-12-04T15:30:59.5795840Z Cache misses 12
2025-12-04T15:30:59.5796064Z Cache misses (C/C++) 12
2025-12-04T15:30:59.5796297Z Cache hits rate 0.00 %
2025-12-04T15:30:59.5796535Z Cache hits rate (C/C++) 0.00 %
2025-12-04T15:30:59.5796770Z Cache timeouts 0
2025-12-04T15:30:59.5796998Z Cache read errors 0
2025-12-04T15:30:59.5797226Z Forced recaches 0
2025-12-04T15:30:59.5797452Z Cache write errors 0
2025-12-04T15:30:59.5797678Z Cache errors 0
2025-12-04T15:30:59.5797905Z Compilations 12
2025-12-04T15:30:59.5798136Z Compilation failures 0
2025-12-04T15:30:59.5798385Z Non-cacheable compilations 0
2025-12-04T15:30:59.5798621Z Non-cacheable calls 13
2025-12-04T15:30:59.5799075Z Non-compilation calls 685
2025-12-04T15:30:59.5799309Z Unsupported compiler calls 0
2025-12-04T15:30:59.5799547Z Average cache write 0.000 s
2025-12-04T15:30:59.5799790Z Average compiler 0.938 s
2025-12-04T15:30:59.5800026Z Average cache read hit 0.000 s
2025-12-04T15:30:59.5800354Z Failed distributed compilations 0
2025-12-04T15:30:59.5800513Z
2025-12-04T15:30:59.5800603Z Non-cacheable reasons:
2025-12-04T15:30:59.5800876Z -E 7
2025-12-04T15:30:59.5801109Z unknown source language 6
2025-12-04T15:30:59.5801259Z
2025-12-04T15:30:59.5801451Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache"
2025-12-04T15:30:59.5801770Z Use direct/preprocessor mode? yes
2025-12-04T15:30:59.5802016Z Version (client) 0.10.0
2025-12-04T15:30:59.5802252Z Cache size 710 KiB
2025-12-04T15:30:59.5802496Z Max cache size 10 GiB
2025-12-04T15:30:59.5802733Z + echo ::endgroup::
2025-12-04T15:30:59.5803099Z ##[endgroup]
2025-12-04T15:30:59.5855517Z ##[error]Process completed with exit code 1.
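The 0.00 % hit rate in both sccache tables above follows directly from the counters: rate = hits / (hits + misses), and all 12 executed compiles missed the local disk cache. A quick check of the arithmetic:

    # Recompute the sccache hit rate from the counters printed above.
    hits, misses = 0, 12
    total = hits + misses
    rate = 100.0 * hits / total if total else 0.0
    print(f"Cache hits rate {rate:.2f} %")  # -> Cache hits rate 0.00 %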
2025-12-04T15:30:59.5889111Z ##[group]Run # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct
2025-12-04T15:30:59.5889431Z # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct
2025-12-04T15:30:59.5889804Z docker exec -t "e26df583e7f4bb508725d538ee16bf0bb710e66d463888c8c13af7907070676c" sh -c "cd ../pytorch && sudo cp -R test/test-reports ../workspace/test"
2025-12-04T15:30:59.5902475Z shell: /usr/bin/bash -e {0}
2025-12-04T15:30:59.5902595Z env:
2025-12-04T15:30:59.5902693Z   GIT_DEFAULT_BRANCH: main
2025-12-04T15:30:59.5902834Z   RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts
2025-12-04T15:30:59.5903013Z   RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results
2025-12-04T15:30:59.5903190Z   RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs
2025-12-04T15:30:59.5903718Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host
2025-12-04T15:30:59.5904212Z   AWS_DEFAULT_REGION: us-east-1
2025-12-04T15:30:59.5904333Z   AWS_REGION: us-east-1
2025-12-04T15:30:59.5904515Z   AWS_ACCESS_KEY_ID: ***
2025-12-04T15:30:59.5904670Z   AWS_SECRET_ACCESS_KEY: ***
2025-12-04T15:30:59.5906818Z   AWS_SESSION_TOKEN: ***
2025-12-04T15:30:59.5906988Z   CONTAINER_NAME: e26df583e7f4bb508725d538ee16bf0bb710e66d463888c8c13af7907070676c
2025-12-04T15:30:59.5907172Z ##[endgroup]
2025-12-04T15:30:59.6572970Z ##[group]Run docker exec -t "e26df583e7f4bb508725d538ee16bf0bb710e66d463888c8c13af7907070676c" sh -c "sudo chown -R 1001:1001 test"
2025-12-04T15:30:59.6573382Z docker exec -t "e26df583e7f4bb508725d538ee16bf0bb710e66d463888c8c13af7907070676c" sh -c "sudo chown -R 1001:1001 test"
2025-12-04T15:30:59.6578100Z shell: /usr/bin/bash -e {0}
2025-12-04T15:30:59.6578211Z env: [same environment block as the step above]
2025-12-04T15:30:59.6582764Z ##[endgroup]
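Note: the two steps above copy the test reports out of the container and then re-own them as the runner user (UID/GID 1001) so host-side steps can read them. A minimal sketch of the same pattern with a hypothetical container name and in-container path (docker cp avoids needing sudo inside the container, but the files still arrive with container-side ownership, so a chown pass remains useful):

# copy a results directory out of a container, then hand it to the current user
CONTAINER=my-test-container                       # hypothetical
docker cp "${CONTAINER}:/workspace/test/test-reports" ./test/
sudo chown -R "$(id -u):$(id -g)" ./test/test-reports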
2025-12-04T15:30:59.7435696Z ##[group]Run cat test/**/*_toprint.log || true
2025-12-04T15:30:59.7435860Z cat test/**/*_toprint.log || true
2025-12-04T15:30:59.7439303Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2025-12-04T15:30:59.7439536Z env: [same environment block as the step above]
2025-12-04T15:30:59.7444089Z ##[endgroup]
2025-12-04T15:30:59.7492734Z cat: 'test/**/*_toprint.log': No such file or directory
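Note: the `cat: 'test/**/*_toprint.log': No such file or directory` line is the unexpanded glob reaching cat: in bash, `**` only recurses when the globstar option is on, and an unmatched pattern is passed through literally unless nullglob is set, so the step relies on `|| true` to stay green. A defensive sketch of the same print step (globstar and nullglob are standard bash shopts; the pattern is taken from the step above):

#!/usr/bin/env bash
shopt -s globstar nullglob          # make ** recursive, drop unmatched globs
files=(test/**/*_toprint.log)
if ((${#files[@]})); then
  cat "${files[@]}"
else
  echo "no *_toprint.log files to print"
fi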
2025-12-04T15:30:59.7562499Z Prepare all required actions
2025-12-04T15:30:59.7562891Z Getting action download info
2025-12-04T15:31:00.1896188Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a)
2025-12-04T15:31:01.0454344Z Download action repository 'actions/upload-artifact@v4' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02)
2025-12-04T15:31:01.8902522Z ##[group]Run ./.github/actions/upload-test-artifacts
2025-12-04T15:31:01.8902675Z with:
2025-12-04T15:31:01.8902770Z   use-gha: true
2025-12-04T15:31:01.8902926Z   file-suffix: test-distributed-3-3-linux.rocm.gpu.gfx942.4.b_57116213181
2025-12-04T15:31:01.8903104Z   s3-bucket: gha-artifacts
2025-12-04T15:31:01.8903223Z env: [same environment block as the step above]
2025-12-04T15:31:01.8907829Z ##[endgroup]
2025-12-04T15:31:01.8938051Z ##[group]Run actions/upload-artifact@v4
2025-12-04T15:31:01.8938181Z with:
2025-12-04T15:31:01.8938376Z   name: test-jsons-runattempt1-test-distributed-3-3-linux.rocm.gpu.gfx942.4.b_57116213181.zip
2025-12-04T15:31:01.8938586Z   retention-days: 14
2025-12-04T15:31:01.8938691Z   if-no-files-found: warn
2025-12-04T15:31:01.8938801Z   path: test/**/*.json
2025-12-04T15:31:01.8938909Z   compression-level: 6
2025-12-04T15:31:01.8939008Z   overwrite: false
2025-12-04T15:31:01.8939201Z   include-hidden-files: false
2025-12-04T15:31:01.8939309Z env: [same environment block as the step above]
2025-12-04T15:31:01.8943882Z ##[endgroup]
2025-12-04T15:31:02.2809743Z With the provided path, there will be 6 files uploaded
2025-12-04T15:31:02.2812774Z Artifact name is valid!
2025-12-04T15:31:02.2813601Z Root directory input is valid!
2025-12-04T15:31:02.5074975Z Beginning upload of artifact content to blob storage
2025-12-04T15:31:02.8972969Z Uploaded bytes 44615
2025-12-04T15:31:02.9693173Z Finished uploading artifact content to blob storage!
2025-12-04T15:31:02.9694123Z SHA256 digest of uploaded artifact zip is deddd08ec9c6e785b6231a245ec60db4e6fbd0dbd28d0b27bcd5454bcc83cae0
2025-12-04T15:31:02.9695452Z Finalizing artifact upload
2025-12-04T15:31:03.1217422Z Artifact test-jsons-runattempt1-test-distributed-3-3-linux.rocm.gpu.gfx942.4.b_57116213181.zip.zip successfully finalized. Artifact ID 4766024049
2025-12-04T15:31:03.1218427Z Artifact test-jsons-runattempt1-test-distributed-3-3-linux.rocm.gpu.gfx942.4.b_57116213181.zip has been successfully uploaded! Final size is 44615 bytes. Artifact ID is 4766024049
2025-12-04T15:31:03.1222300Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922849170/artifacts/4766024049
2025-12-04T15:31:03.1336485Z ##[group]Run actions/upload-artifact@v4
2025-12-04T15:31:03.1336640Z with:
2025-12-04T15:31:03.1336847Z   name: test-reports-runattempt1-test-distributed-3-3-linux.rocm.gpu.gfx942.4.b_57116213181.zip
2025-12-04T15:31:03.1337075Z   retention-days: 14
2025-12-04T15:31:03.1337206Z   if-no-files-found: ignore
2025-12-04T15:31:03.1337333Z   path: test/**/*.xml test/**/*.csv
2025-12-04T15:31:03.1337460Z   compression-level: 6
2025-12-04T15:31:03.1337584Z   overwrite: false
2025-12-04T15:31:03.1337693Z   include-hidden-files: false
2025-12-04T15:31:03.1337809Z env: [same environment block as the step above]
2025-12-04T15:31:03.1342528Z ##[endgroup]
2025-12-04T15:31:03.5540958Z With the provided path, there will be 756 files uploaded
2025-12-04T15:31:03.5543869Z Artifact name is valid!
2025-12-04T15:31:03.5544550Z Root directory input is valid!
2025-12-04T15:31:03.7732804Z Beginning upload of artifact content to blob storage
2025-12-04T15:31:04.4718580Z Uploaded bytes 631048
2025-12-04T15:31:04.5384217Z Finished uploading artifact content to blob storage!
2025-12-04T15:31:04.5385145Z SHA256 digest of uploaded artifact zip is 4baf908a2a8cdf274a5ff449beb4ab833576fce9fe7c5f50d523a2e1abdbe751
2025-12-04T15:31:04.5385799Z Finalizing artifact upload
2025-12-04T15:31:04.6933999Z Artifact test-reports-runattempt1-test-distributed-3-3-linux.rocm.gpu.gfx942.4.b_57116213181.zip.zip successfully finalized. Artifact ID 4766024390
2025-12-04T15:31:04.6935962Z Artifact test-reports-runattempt1-test-distributed-3-3-linux.rocm.gpu.gfx942.4.b_57116213181.zip has been successfully uploaded! Final size is 631048 bytes. Artifact ID is 4766024390
2025-12-04T15:31:04.6939732Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922849170/artifacts/4766024390
2025-12-04T15:31:04.7075022Z ##[group]Run actions/upload-artifact@v4
2025-12-04T15:31:04.7075207Z with:
2025-12-04T15:31:04.7075406Z   name: logs-runattempt1-test-distributed-3-3-linux.rocm.gpu.gfx942.4.b_57116213181.zip
2025-12-04T15:31:04.7075629Z   retention-days: 14
2025-12-04T15:31:04.7075760Z   if-no-files-found: ignore
2025-12-04T15:31:04.7075894Z   path: usage_log.txt test/**/*.log
2025-12-04T15:31:04.7076037Z   compression-level: 6
2025-12-04T15:31:04.7076160Z   overwrite: false
2025-12-04T15:31:04.7076286Z   include-hidden-files: false
2025-12-04T15:31:04.7076431Z env: [same environment block as the step above]
2025-12-04T15:31:04.7081671Z ##[endgroup]
2025-12-04T15:31:05.1420993Z Multiple search paths detected. Calculating the least common ancestor of all paths
2025-12-04T15:31:05.1421894Z The least common ancestor is /home/runner/_work/pytorch/pytorch. This will be the root directory of the artifact
2025-12-04T15:31:05.1422172Z With the provided path, there will be 124 files uploaded
2025-12-04T15:31:05.1428062Z Artifact name is valid!
2025-12-04T15:31:05.1428605Z Root directory input is valid!
2025-12-04T15:31:05.4038003Z Beginning upload of artifact content to blob storage
2025-12-04T15:31:06.0127020Z Uploaded bytes 461791
2025-12-04T15:31:06.0780547Z Finished uploading artifact content to blob storage!
2025-12-04T15:31:06.0782188Z SHA256 digest of uploaded artifact zip is a54e52ee22e2bc1db90645d794e6a832b818a27ca0f0ded0ca438b5a2a2f5505
2025-12-04T15:31:06.0782618Z Finalizing artifact upload
2025-12-04T15:31:06.2306864Z Artifact logs-runattempt1-test-distributed-3-3-linux.rocm.gpu.gfx942.4.b_57116213181.zip.zip successfully finalized. Artifact ID 4766024750
2025-12-04T15:31:06.2307731Z Artifact logs-runattempt1-test-distributed-3-3-linux.rocm.gpu.gfx942.4.b_57116213181.zip has been successfully uploaded! Final size is 461791 bytes. Artifact ID is 4766024750
2025-12-04T15:31:06.2312805Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922849170/artifacts/4766024750
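Note: the three artifacts above (test JSONs, test reports, logs) are attached to run 19922849170. A minimal sketch of fetching one of them locally with the GitHub CLI, assuming `gh` is installed and authenticated (`gh run download` with `-R`, `-n`, and `-D` are standard flags; the destination directory is hypothetical):

# pull the test-reports artifact from this run into ./reports
gh run download 19922849170 -R pytorch/pytorch \
  -n test-reports-runattempt1-test-distributed-3-3-linux.rocm.gpu.gfx942.4.b_57116213181.zip \
  -D ./reports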
2025-12-04T15:31:06.2426882Z ##[group]Run # shellcheck disable=SC2156
2025-12-04T15:31:06.2427066Z # shellcheck disable=SC2156
2025-12-04T15:31:06.2427294Z find . -iname "core.[1-9]*" -exec docker exec "${CONTAINER_NAME}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \;
2025-12-04T15:31:06.2431864Z shell: /usr/bin/bash -e {0}
2025-12-04T15:31:06.2431987Z env: [same environment block as the step above]
2025-12-04T15:31:06.2436802Z ##[endgroup]
2025-12-04T15:31:06.3772458Z ##[group]Run actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02
2025-12-04T15:31:06.3772662Z with:
2025-12-04T15:31:06.3772805Z   name: coredumps-distributed-3-3-linux.rocm.gpu.gfx942.4.b
2025-12-04T15:31:06.3772969Z   retention-days: 14
2025-12-04T15:31:06.3773083Z   if-no-files-found: ignore
2025-12-04T15:31:06.3773200Z   path: ./**/core.[1-9]*
2025-12-04T15:31:06.3773315Z   compression-level: 6
2025-12-04T15:31:06.3773422Z   overwrite: false
2025-12-04T15:31:06.3773531Z   include-hidden-files: false
2025-12-04T15:31:06.3773657Z env: [same environment block as the step above]
2025-12-04T15:31:06.3778266Z ##[endgroup]
2025-12-04T15:31:10.2197819Z No files were found with the provided path: ./**/core.[1-9]*. No artifacts will be uploaded.
2025-12-04T15:31:10.2368948Z Post job cleanup.
2025-12-04T15:31:10.2381986Z Post job cleanup.
2025-12-04T15:31:10.2578022Z Logging out of registry 308535385114.dkr.ecr.us-east-1.amazonaws.com
2025-12-04T15:31:10.2766047Z Post job cleanup.
2025-12-04T15:31:10.3383884Z Post job cleanup.
2025-12-04T15:31:10.3404775Z Post job cleanup.
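Note: the find/gdb step above scans the workspace for kernel core dumps (core.<pid>) and, for each one, prints a backtrace inside the test container; here none were found, so the coredumps artifact upload was skipped. A minimal sketch of the same triage on a single core file, assuming gdb and the crashing binary are available (`-batch` and `-ex` are standard gdb flags; the file names are hypothetical):

# non-interactive backtrace from a core dump
gdb -batch -ex "bt" python core.12345
# for every thread: gdb -batch -ex "thread apply all bt" python core.12345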
2025-12-04T15:31:10.3887816Z [command]/usr/bin/git version
2025-12-04T15:31:10.3916623Z git version 2.52.0
2025-12-04T15:31:10.3943986Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/0994218d-25dd-4930-b764-ff66fe0b6243/.gitconfig'
2025-12-04T15:31:10.3950524Z Temporarily overriding HOME='/home/runner/_work/_temp/0994218d-25dd-4930-b764-ff66fe0b6243' before making global git config changes
2025-12-04T15:31:10.3950933Z Adding repository directory to the temporary git global config as a safe directory
2025-12-04T15:31:10.3953461Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch
2025-12-04T15:31:10.3991255Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand
2025-12-04T15:31:10.4015915Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :"
[per-submodule "Entering '<path>'" output omitted: the command visits android/libs/fbjni and every third_party submodule recursively; none has a core.sshCommand entry to unset]
2025-12-04T15:31:10.6453136Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader
2025-12-04T15:31:10.6472980Z http.https://github.com/.extraheader
2025-12-04T15:31:10.6486121Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader
2025-12-04T15:31:10.6504382Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :"
[per-submodule output omitted: each submodule in the same list reports an http.https://github.com/.extraheader entry, which is then unset]
2025-12-04T15:31:10.9641794Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:10.9666128Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url
[per-submodule output omitted: each submodule reports its config file under /home/runner/_work/pytorch/pytorch/.git/modules/<module path>/config as the origin of remote.origin.url]
2025-12-04T15:31:11.1937773Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir:
[the same --get-regexp ^includeIf\.gitdir: check is repeated against each submodule's config file; the captured log breaks off mid-stream at 2025-12-04T15:31:11.2279595Z]
[command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2295922Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2312105Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2329843Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2345916Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2362373Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2378623Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2400672Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2414878Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2431456Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2445570Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2461380Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2477469Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2493777Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2512142Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2533522Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2554608Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2574584Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config 
--name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2593252Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2610520Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2629637Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2646933Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2672544Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2688944Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2705578Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2722161Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2738989Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2755900Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2773074Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2790510Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2807317Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2823805Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2844402Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2861514Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2878331Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2895293Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2920580Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2928946Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2950662Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2968077Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.2986788Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3004370Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3021041Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3049624Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3067326Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3084555Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3101501Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3118510Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3135316Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only 
--get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3151744Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3172825Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3192412Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3210650Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3233764Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3250999Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3268099Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3283621Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3300510Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3319679Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3336824Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3359816Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3384409Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.3506736Z Post job cleanup. 
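The "Post job cleanup." phase that follows is actions/checkout undoing the credential plumbing it set up at checkout time: it copies the runner's .gitconfig into a throwaway HOME so its --global edits never touch the real ~/.gitconfig, marks the workspace as a safe.directory, then unsets core.sshCommand and the injected http.https://github.com/.extraheader in the superproject and in every submodule, which is why the "Entering '...'" walks repeat below. The remote.origin.url and includeIf.gitdir scans that bracket those walks just locate the real config file backing each submodule (under .git/modules/...) before and after the unset. A minimal sketch of the same scrub, built only from the commands visible in this log (the mktemp temp-HOME is a stand-in for the runner's per-job UUID directory, and the workspace path is the one this job uses):

  # Throwaway HOME so the --global safe.directory edit stays out of the real ~/.gitconfig.
  # The runner uses a per-job UUID directory under _work/_temp; mktemp is a stand-in here.
  export HOME="$(mktemp -d)"
  cp /home/runner/.gitconfig "$HOME/.gitconfig"
  git config --global --add safe.directory /home/runner/_work/pytorch/pytorch

  cd /home/runner/_work/pytorch/pytorch

  # Drop any per-repo SSH command override, in the superproject and every submodule.
  git config --local --name-only --get-regexp 'core\.sshCommand' \
    && git config --local --unset-all 'core.sshCommand' || :
  git submodule foreach --recursive \
    sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :"

  # Drop the basic-auth header that was injected for HTTPS fetches from github.com.
  git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' \
    && git config --local --unset-all 'http.https://github.com/.extraheader' || :
  git submodule foreach --recursive \
    sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :"

The "&& ... || :" pattern makes each step a no-op when the key is absent, so the cleanup succeeds whether or not the checkout step ever set those values.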
2025-12-04T15:31:11.3963788Z [command]/usr/bin/git version 2025-12-04T15:31:11.3992279Z git version 2.52.0 2025-12-04T15:31:11.4014626Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/d395cc42-b710-4763-8909-1521fdf1d35a/.gitconfig' 2025-12-04T15:31:11.4020365Z Temporarily overriding HOME='/home/runner/_work/_temp/d395cc42-b710-4763-8909-1521fdf1d35a' before making global git config changes 2025-12-04T15:31:11.4020714Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T15:31:11.4022995Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T15:31:11.4050629Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T15:31:11.4081842Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T15:31:11.4281838Z Entering 'android/libs/fbjni' 2025-12-04T15:31:11.4322138Z Entering 'third_party/FP16' 2025-12-04T15:31:11.4346922Z Entering 'third_party/FXdiv' 2025-12-04T15:31:11.4368284Z Entering 'third_party/NNPACK' 2025-12-04T15:31:11.4392372Z Entering 'third_party/NVTX' 2025-12-04T15:31:11.4422109Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T15:31:11.4446380Z Entering 'third_party/XNNPACK' 2025-12-04T15:31:11.4475076Z Entering 'third_party/aiter' 2025-12-04T15:31:11.4504333Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T15:31:11.4533074Z Entering 'third_party/benchmark' 2025-12-04T15:31:11.4571831Z Entering 'third_party/composable_kernel' 2025-12-04T15:31:11.4604072Z Entering 'third_party/cpp-httplib' 2025-12-04T15:31:11.4628972Z Entering 'third_party/cpuinfo' 2025-12-04T15:31:11.4655369Z Entering 'third_party/cudnn_frontend' 2025-12-04T15:31:11.4679772Z Entering 'third_party/cutlass' 2025-12-04T15:31:11.4706577Z Entering 'third_party/fbgemm' 2025-12-04T15:31:11.4738726Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T15:31:11.4771641Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T15:31:11.4803992Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T15:31:11.4829833Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T15:31:11.4862803Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T15:31:11.4890468Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T15:31:11.4923634Z Entering 'third_party/fbgemm/external/json' 2025-12-04T15:31:11.4957264Z Entering 'third_party/flash-attention' 2025-12-04T15:31:11.4988123Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T15:31:11.5019487Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T15:31:11.5060927Z Entering 'third_party/flatbuffers' 2025-12-04T15:31:11.5093057Z Entering 'third_party/fmt' 2025-12-04T15:31:11.5121662Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T15:31:11.5151938Z Entering 'third_party/gloo' 2025-12-04T15:31:11.5175885Z Entering 'third_party/googletest' 2025-12-04T15:31:11.5198464Z Entering 'third_party/ideep' 2025-12-04T15:31:11.5221812Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T15:31:11.5253256Z Entering 'third_party/ittapi' 2025-12-04T15:31:11.5286485Z Entering 'third_party/kineto' 2025-12-04T15:31:11.5319099Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T15:31:11.5351904Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T15:31:11.5381367Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T15:31:11.5407939Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T15:31:11.5440870Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T15:31:11.5468481Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T15:31:11.5494060Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T15:31:11.5519185Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T15:31:11.5542370Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T15:31:11.5568725Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T15:31:11.5599107Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T15:31:11.5624794Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T15:31:11.5654370Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T15:31:11.5686790Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T15:31:11.5715355Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T15:31:11.5748026Z Entering 'third_party/kleidiai' 2025-12-04T15:31:11.5780736Z Entering 'third_party/mimalloc' 2025-12-04T15:31:11.5805588Z Entering 'third_party/nlohmann' 2025-12-04T15:31:11.5832874Z Entering 'third_party/onnx' 2025-12-04T15:31:11.5867258Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T15:31:11.5896837Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T15:31:11.5922254Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T15:31:11.5950544Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T15:31:11.5973943Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T15:31:11.6003975Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T15:31:11.6032085Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T15:31:11.6061383Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T15:31:11.6089837Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T15:31:11.6121128Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T15:31:11.6151238Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T15:31:11.6178777Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T15:31:11.6213130Z Entering 'third_party/pocketfft' 2025-12-04T15:31:11.6241329Z Entering 'third_party/protobuf' 2025-12-04T15:31:11.6267607Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T15:31:11.6293821Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T15:31:11.6326402Z Entering 'third_party/psimd' 2025-12-04T15:31:11.6354341Z Entering 'third_party/pthreadpool' 2025-12-04T15:31:11.6377454Z Entering 'third_party/pybind11' 2025-12-04T15:31:11.6401180Z Entering 'third_party/python-peachpy' 2025-12-04T15:31:11.6424477Z Entering 'third_party/sleef' 2025-12-04T15:31:11.6448607Z Entering 'third_party/tensorpipe' 2025-12-04T15:31:11.6475659Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T15:31:11.6504523Z Entering 
'third_party/tensorpipe/third_party/libnop' 2025-12-04T15:31:11.6528443Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T15:31:11.6553965Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T15:31:11.6579275Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T15:31:11.6626690Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T15:31:11.6647349Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T15:31:11.6835574Z Entering 'android/libs/fbjni' 2025-12-04T15:31:11.6882027Z Entering 'third_party/FP16' 2025-12-04T15:31:11.6904449Z Entering 'third_party/FXdiv' 2025-12-04T15:31:11.6929719Z Entering 'third_party/NNPACK' 2025-12-04T15:31:11.6951668Z Entering 'third_party/NVTX' 2025-12-04T15:31:11.6976708Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T15:31:11.7002739Z Entering 'third_party/XNNPACK' 2025-12-04T15:31:11.7036066Z Entering 'third_party/aiter' 2025-12-04T15:31:11.7061875Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T15:31:11.7093413Z Entering 'third_party/benchmark' 2025-12-04T15:31:11.7117528Z Entering 'third_party/composable_kernel' 2025-12-04T15:31:11.7147145Z Entering 'third_party/cpp-httplib' 2025-12-04T15:31:11.7174966Z Entering 'third_party/cpuinfo' 2025-12-04T15:31:11.7195360Z Entering 'third_party/cudnn_frontend' 2025-12-04T15:31:11.7218248Z Entering 'third_party/cutlass' 2025-12-04T15:31:11.7242895Z Entering 'third_party/fbgemm' 2025-12-04T15:31:11.7266827Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T15:31:11.7288434Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T15:31:11.7321945Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T15:31:11.7353650Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T15:31:11.7378704Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T15:31:11.7402608Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T15:31:11.7422971Z Entering 'third_party/fbgemm/external/json' 2025-12-04T15:31:11.7451304Z Entering 'third_party/flash-attention' 2025-12-04T15:31:11.7474379Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T15:31:11.7500814Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T15:31:11.7525807Z Entering 'third_party/flatbuffers' 2025-12-04T15:31:11.7549899Z Entering 'third_party/fmt' 2025-12-04T15:31:11.7572350Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T15:31:11.7596245Z Entering 'third_party/gloo' 2025-12-04T15:31:11.7622163Z Entering 'third_party/googletest' 2025-12-04T15:31:11.7643805Z Entering 'third_party/ideep' 2025-12-04T15:31:11.7667542Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T15:31:11.7698790Z Entering 'third_party/ittapi' 2025-12-04T15:31:11.7722190Z Entering 'third_party/kineto' 2025-12-04T15:31:11.7745160Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T15:31:11.7765704Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T15:31:11.7786680Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T15:31:11.7813470Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T15:31:11.7839093Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T15:31:11.7860390Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T15:31:11.7882506Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T15:31:11.7910357Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T15:31:11.7936680Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T15:31:11.7969050Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T15:31:11.7995012Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T15:31:11.8019909Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T15:31:11.8042988Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T15:31:11.8066896Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T15:31:11.8092585Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T15:31:11.8115807Z Entering 'third_party/kleidiai' 2025-12-04T15:31:11.8147567Z Entering 'third_party/mimalloc' 2025-12-04T15:31:11.8174604Z Entering 'third_party/nlohmann' 2025-12-04T15:31:11.8195998Z Entering 'third_party/onnx' 2025-12-04T15:31:11.8224310Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T15:31:11.8255722Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T15:31:11.8287561Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T15:31:11.8310606Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T15:31:11.8335858Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T15:31:11.8361106Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T15:31:11.8382598Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T15:31:11.8403293Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T15:31:11.8424774Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T15:31:11.8446057Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T15:31:11.8472563Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T15:31:11.8500954Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T15:31:11.8547598Z Entering 'third_party/pocketfft' 2025-12-04T15:31:11.8579363Z Entering 'third_party/protobuf' 2025-12-04T15:31:11.8620438Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T15:31:11.8650542Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T15:31:11.8675125Z Entering 'third_party/psimd' 2025-12-04T15:31:11.8700711Z Entering 'third_party/pthreadpool' 2025-12-04T15:31:11.8737526Z Entering 'third_party/pybind11' 2025-12-04T15:31:11.8768570Z Entering 'third_party/python-peachpy' 2025-12-04T15:31:11.8799363Z Entering 'third_party/sleef' 2025-12-04T15:31:11.8831465Z Entering 'third_party/tensorpipe' 2025-12-04T15:31:11.8860345Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T15:31:11.8881425Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T15:31:11.8903727Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T15:31:11.8931893Z Entering 'third_party/tensorpipe/third_party/pybind11' 
2025-12-04T15:31:11.8955786Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T15:31:11.9012087Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:11.9035842Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T15:31:11.9248479Z Entering 'android/libs/fbjni' 2025-12-04T15:31:11.9263060Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T15:31:11.9274030Z Entering 'third_party/FP16' 2025-12-04T15:31:11.9286942Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T15:31:11.9297705Z Entering 'third_party/FXdiv' 2025-12-04T15:31:11.9314570Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T15:31:11.9324946Z Entering 'third_party/NNPACK' 2025-12-04T15:31:11.9341128Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T15:31:11.9360135Z Entering 'third_party/NVTX' 2025-12-04T15:31:11.9376607Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T15:31:11.9387234Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T15:31:11.9402840Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T15:31:11.9415063Z Entering 'third_party/XNNPACK' 2025-12-04T15:31:11.9429750Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T15:31:11.9451294Z Entering 'third_party/aiter' 2025-12-04T15:31:11.9464747Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T15:31:11.9474242Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T15:31:11.9488072Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T15:31:11.9509650Z Entering 'third_party/benchmark' 2025-12-04T15:31:11.9523214Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T15:31:11.9531756Z Entering 'third_party/composable_kernel' 2025-12-04T15:31:11.9542514Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T15:31:11.9556360Z Entering 'third_party/cpp-httplib' 2025-12-04T15:31:11.9569228Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T15:31:11.9582099Z Entering 'third_party/cpuinfo' 2025-12-04T15:31:11.9593804Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T15:31:11.9610546Z Entering 'third_party/cudnn_frontend' 2025-12-04T15:31:11.9621821Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T15:31:11.9633089Z Entering 'third_party/cutlass' 2025-12-04T15:31:11.9645228Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T15:31:11.9659272Z Entering 'third_party/fbgemm' 2025-12-04T15:31:11.9671414Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T15:31:11.9683079Z Entering 
'third_party/fbgemm/external/asmjit' 2025-12-04T15:31:11.9702084Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T15:31:11.9712500Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T15:31:11.9726985Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T15:31:11.9740582Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T15:31:11.9752908Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T15:31:11.9766462Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T15:31:11.9777760Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T15:31:11.9790060Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T15:31:11.9800033Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T15:31:11.9808132Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T15:31:11.9818967Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T15:31:11.9826905Z Entering 'third_party/fbgemm/external/json' 2025-12-04T15:31:11.9836476Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T15:31:11.9849563Z Entering 'third_party/flash-attention' 2025-12-04T15:31:11.9861086Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T15:31:11.9871673Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T15:31:11.9881516Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T15:31:11.9893567Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T15:31:11.9903978Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T15:31:11.9923452Z Entering 'third_party/flatbuffers' 2025-12-04T15:31:11.9934704Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T15:31:11.9945040Z Entering 'third_party/fmt' 2025-12-04T15:31:11.9955175Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T15:31:11.9964369Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T15:31:11.9976562Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T15:31:11.9988571Z Entering 'third_party/gloo' 2025-12-04T15:31:12.0001389Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T15:31:12.0010942Z Entering 'third_party/googletest' 2025-12-04T15:31:12.0022388Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T15:31:12.0035322Z Entering 'third_party/ideep' 2025-12-04T15:31:12.0050137Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T15:31:12.0063802Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T15:31:12.0075670Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T15:31:12.0094188Z Entering 'third_party/ittapi' 2025-12-04T15:31:12.0107115Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T15:31:12.0117437Z Entering 'third_party/kineto' 2025-12-04T15:31:12.0128709Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T15:31:12.0138840Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T15:31:12.0150083Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T15:31:12.0159358Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T15:31:12.0175106Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T15:31:12.0186879Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T15:31:12.0200243Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T15:31:12.0209393Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T15:31:12.0228506Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T15:31:12.0237846Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T15:31:12.0257095Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T15:31:12.0266511Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T15:31:12.0282718Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T15:31:12.0292495Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T15:31:12.0309054Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T15:31:12.0318555Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T15:31:12.0333686Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T15:31:12.0342869Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T15:31:12.0360016Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T15:31:12.0369659Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T15:31:12.0389007Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T15:31:12.0398890Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 
2025-12-04T15:31:12.0408554Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T15:31:12.0417060Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T15:31:12.0428929Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T15:31:12.0445285Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T15:31:12.0458060Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T15:31:12.0471665Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T15:31:12.0489148Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T15:31:12.0499256Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T15:31:12.0513537Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T15:31:12.0525333Z Entering 'third_party/kleidiai' 2025-12-04T15:31:12.0542224Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T15:31:12.0553555Z Entering 'third_party/mimalloc' 2025-12-04T15:31:12.0570721Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T15:31:12.0581389Z Entering 'third_party/nlohmann' 2025-12-04T15:31:12.0596680Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T15:31:12.0609227Z Entering 'third_party/onnx' 2025-12-04T15:31:12.0620599Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T15:31:12.0650985Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T15:31:12.0661797Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T15:31:12.0674414Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T15:31:12.0685545Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T15:31:12.0696347Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T15:31:12.0710782Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T15:31:12.0719958Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T15:31:12.0733928Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T15:31:12.0742827Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T15:31:12.0756497Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T15:31:12.0766012Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T15:31:12.0777710Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T15:31:12.0788302Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T15:31:12.0800224Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T15:31:12.0811456Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T15:31:12.0823912Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T15:31:12.0834849Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T15:31:12.0849830Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T15:31:12.0859324Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T15:31:12.0875755Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T15:31:12.0885648Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T15:31:12.0900941Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T15:31:12.0916516Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T15:31:12.0930667Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T15:31:12.0948267Z Entering 'third_party/pocketfft' 2025-12-04T15:31:12.0959511Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T15:31:12.0970252Z Entering 'third_party/protobuf' 2025-12-04T15:31:12.0986293Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T15:31:12.0999065Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T15:31:12.1014851Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T15:31:12.1029261Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T15:31:12.1042726Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T15:31:12.1057851Z Entering 'third_party/psimd' 2025-12-04T15:31:12.1070644Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T15:31:12.1082434Z Entering 'third_party/pthreadpool' 2025-12-04T15:31:12.1095450Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T15:31:12.1109756Z Entering 'third_party/pybind11' 2025-12-04T15:31:12.1123941Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T15:31:12.1137190Z Entering 'third_party/python-peachpy' 2025-12-04T15:31:12.1153776Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T15:31:12.1167814Z Entering 'third_party/sleef' 2025-12-04T15:31:12.1179749Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T15:31:12.1190278Z Entering 'third_party/tensorpipe' 2025-12-04T15:31:12.1202824Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T15:31:12.1213885Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T15:31:12.1223169Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T15:31:12.1233438Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T15:31:12.1247041Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T15:31:12.1256382Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T15:31:12.1269073Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T15:31:12.1284589Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T15:31:12.1295348Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T15:31:12.1306461Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T15:31:12.1323021Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T15:31:12.1350053Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:12.1373931Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:12.1397030Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:12.1413406Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:12.1430756Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:12.1446973Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:12.1467700Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:12.1485134Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:12.1501333Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:12.1516073Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T15:31:12.1530912Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1544646Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1559486Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1574162Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1587952Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1602434Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1616546Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1632331Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1646495Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1662883Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1676501Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1693461Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1717005Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1722707Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1736027Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1754666Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1768900Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1784338Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1798426Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1813454Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1827411Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1841741Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1856545Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1871627Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1888609Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1902958Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1916009Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1929288Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1942446Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1955490Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1971880Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.1991280Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2004485Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2021324Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2038462Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2053254Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2072357Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2086516Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2100550Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2113690Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2132331Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2147753Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2162256Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2178996Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2193698Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2210109Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2225039Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2241497Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2259976Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2281869Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2296430Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2310109Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2324910Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2344366Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2358425Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2374438Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2391173Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2406892Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2422311Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2440648Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2456178Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2472006Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2486756Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2502356Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2519041Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2536428Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2552072Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2568668Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2584543Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2601821Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2618347Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T15:31:12.2720814Z Cleaning up orphan processes
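Note on the command trace above: the checkout action's post-job cleanup walks every submodule config under .git/modules and queries each file for includeIf.gitdir keys, presumably so that any conditional-include settings left behind by an earlier checkout on this reusable runner can be detected and scrubbed before the workspace is handed to the next job. A minimal sketch of an equivalent scan, assuming a POSIX shell with find and git on PATH (an illustration of the pattern seen in the log, not the action's actual implementation):

    # Walk every submodule config and list any includeIf.gitdir entries.
    # `|| true` keeps the loop running: git config exits non-zero when no
    # key matches the regexp, which is the expected (clean) case here.
    find .git/modules -type f -name config | while read -r cfg; do
      git config --file "$cfg" --name-only --get-regexp '^includeIf\.gitdir:' || true
    done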